Cedric Nugteren
492ee3d0a5
Removed the invert routine from the tests
2017-02-25 12:28:13 +01:00
Cedric Nugteren
bdc57221bd
Added simple tests for the OverrideParameters function
2017-02-14 21:09:00 +01:00
Cedric Nugteren
c248f900c0
Merge branch 'development' into triangular_solvers
2017-02-05 22:18:59 +01:00
Cedric Nugteren
a5fd2323b6
Added prototype for the TRSV routine
2017-01-20 11:30:32 +01:00
Cedric Nugteren
4b3ffd9989
Added a first version of the diagonal block invert routine in preparation of TRSM
2017-01-15 17:30:00 +01:00
Cedric Nugteren
ff2bf985a3
Updated the link to cl.hpp in the Khronos registry for the samples
2017-01-07 13:57:23 +01:00
Cedric Nugteren
681a465b35
Prepared for the addition of the TRSM triangular solver kernel
2016-12-18 12:30:16 +01:00
Cedric Nugteren
2cf7d8429a
Updated to version 0.10.0
2016-11-27 13:34:18 +01:00
Cedric Nugteren
39c49bf4f9
Made it possible to use the command-line environmental variables for each executable and without re-running CMake
2016-11-27 11:00:29 +01:00
Cedric Nugteren
2ff3f77392
Made the Netlib SGEMM example also optionally compiled
2016-11-23 22:07:11 +01:00
Cedric Nugteren
fa42befcc1
Made compilation of the Netlib CBLAS API conditional
2016-11-23 21:33:35 +01:00
Cedric Nugteren
bb14a5880e
Added an example and documentation for the Netlib CBLAS API
2016-10-25 20:37:33 +02:00
Cedric Nugteren
8ae8ab06a2
Renamed the include and source files of the Netlib CBLAS API
2016-10-25 20:33:10 +02:00
Cedric Nugteren
140121ef91
Removed the clblast namespace from the Netlib C API source file to ensure proper linking
2016-10-25 20:21:50 +02:00
Cedric Nugteren
f96fd372bc
Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes
2016-10-25 14:28:52 +02:00
Cedric Nugteren
fda39ffd86
Fixed the CMakeLists.txt for Visual Studio compilation
2016-10-23 14:34:46 +02:00
Cedric Nugteren
de0420dffa
Minor clean-up of the CMakeLists file
2016-10-22 16:38:42 +02:00
Cedric Nugteren
b0ff11acf0
Moved files around a bit; created a utilities subfolder
2016-10-22 15:36:48 +02:00
Cedric Nugteren
280698d076
Merge pull request #117 from intelfx/exceptions
Convert to use C++ exceptions internally
2016-10-22 15:05:12 +02:00
Ivan Shapovalov
b98af44fcf
treewide: use C++ exceptions properly
Since the codebase is designed around proper C++ idioms such as RAII, it
makes sense to only use C++ exceptions internally instead of mixing
exceptions and error codes. The exceptions are now caught at top level
to preserve compatibility with the existing error code-based API.
Note that we deliberately do not catch C++ runtime errors (such as
`std::bad_alloc`) nor logic errors (aka failed assertions) because no
actual handling can ever happen for such errors.
However, in the C interface we do catch _all_ exceptions (...) and
convert them into a wild-card error code.
2016-10-22 08:45:25 +03:00
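The layering described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the library's actual code: `StatusCode`, `LibraryError`, `ScaleInternal`, `Scale`, and `ScaleC` are hypothetical names invented for this example. Internal code throws, the C++ entry point translates only the library's own exception type into an error code, and the C entry point catches everything and maps unknown failures to a wild-card code.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical status codes (assumed for this sketch, not the real enum).
enum class StatusCode { kSuccess = 0, kInvalidValue = -1, kUnknownError = -2 };

// Hypothetical library-specific exception that carries a status code.
class LibraryError : public std::exception {
 public:
  explicit LibraryError(StatusCode status) : status_(status) {}
  StatusCode status() const noexcept { return status_; }
  const char* what() const noexcept override { return "library error"; }
 private:
  StatusCode status_;
};

// Internal routine: reports every failure by throwing.
void ScaleInternal(int n) {
  if (n < 0) { throw LibraryError(StatusCode::kInvalidValue); }
  if (n > 1000) { throw std::length_error("n too large"); }  // stands in for a logic error
}

// C++ API boundary: only the library's own exceptions become error codes.
// Anything else (logic errors, std::bad_alloc, ...) propagates to the caller.
StatusCode Scale(int n) {
  try {
    ScaleInternal(n);
    return StatusCode::kSuccess;
  } catch (const LibraryError& e) {
    return e.status();
  }
}

// C API boundary: no exception may cross into C code, so catch everything
// and convert unknown failures into a wild-card error code.
extern "C" int ScaleC(int n) {
  try {
    return static_cast<int>(Scale(n));
  } catch (...) {
    return static_cast<int>(StatusCode::kUnknownError);
  }
}

// Helper for the sketch: true if a non-library error escapes the C++ API.
bool PropagatesNonLibraryErrors() {
  try {
    Scale(2000);
  } catch (const std::length_error&) {
    return true;
  }
  return false;
}
```

In this sketch, `Scale(-1)` yields `StatusCode::kInvalidValue`, whereas `Scale(2000)` lets the `std::length_error` escape; `ScaleC(2000)` instead returns the wild-card value.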
Cedric Nugteren
597974b40d
Merge pull request #118 from matze/add-pkg-config
Generate and install pkg-config description
2016-10-21 21:00:07 +02:00
Matthias Vogelgesang
3797d144cc
Generate and install pkg-config description
2016-10-21 09:38:25 +02:00
Cedric Nugteren
c8d0e41e84
Added the possibility to supply the env-variable CLBLAST_TEST_ARGUMENTS to specify options for the make alltest or ctest targets
2016-10-20 23:05:16 +02:00
Cedric Nugteren
53deed298f
Added documentation and minor refactoring for the recent support of static library compilation
2016-10-15 17:11:08 +02:00
Shehzan Mohammed
0d958bf3b3
Fixes for static lib compilation on Windows
2016-10-14 18:45:34 -04:00
Cedric Nugteren
c0482ace6c
Fixed a bug where clblas.h couldn't be found for the performance tests (clients)
2016-10-14 22:11:35 +02:00
Cedric Nugteren
3386ad49c4
Set proper flags for the verbose mode (debug flags)
2016-10-14 20:54:05 +02:00
Cedric Nugteren
99a620f9a1
Merge pull request #112 from shehzan10/static
Add option to build shared or static library
2016-10-14 10:06:44 +02:00
Shehzan Mohammed
56f07e42b1
Add option to build shared or static library
2016-10-13 12:03:44 -04:00
Cedric Nugteren
a9d35cf04c
Merge branch 'development' into gemm_direct
2016-10-01 13:45:08 +02:00
Anton Lokhmotov
c484bb26b6
Use cross-platform thread lib idiom instead of *nix-specific pthread.
2016-09-26 21:04:28 +00:00
Anton Lokhmotov
c20a5bb7ca
Link clBLAS together with pthread.
2016-09-26 10:30:18 +00:00
Cedric Nugteren
73d135c2ce
Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, NWGD and KWGD into one WGD parameter
2016-09-25 14:48:34 +02:00
Anton Lokhmotov
750f185ba9
Add path to ref library header when building tests.
2016-09-24 11:46:34 +00:00
Cedric Nugteren
4b94afda94
Updated to version 0.9.0
2016-09-13 19:20:39 +02:00
Cedric Nugteren
48ab0428cb
Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM
2016-09-13 19:08:49 +02:00
Ivan Shapovalov
9095537a6a
CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings
2016-09-13 16:12:30 +03:00
Cedric Nugteren
35623cd98d
Minor update regarding the previous CMake export/install target changes
2016-07-28 20:45:09 +02:00
Ivan Shapovalov
b5d7b58393
CMakeLists.txt: use target_include_directories()
2016-07-28 19:09:29 +03:00
Ivan Shapovalov
570cbcffa7
CMakeLists.txt: provide a find_package() config for dependent projects
2016-07-28 19:09:29 +03:00
Ivan Shapovalov
a1d80e7402
CMakeLists.txt: use ${clblast_SOURCE_DIR} instead of ${CMAKE_SOURCE_DIR}
2016-07-22 11:15:52 +03:00
Cedric Nugteren
27854070b4
Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen
2016-07-06 21:50:12 +02:00
CNugteren
2d665099ef
Fixed a linking issue with the tuners on Visual Studio
2016-07-04 19:46:14 +02:00
Cedric Nugteren
b330ab0866
Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library
2016-06-30 10:49:17 +02:00
Cedric Nugteren
577f0ee117
Updated to version 0.8.0
2016-06-28 21:32:00 +02:00
CNugteren
871b576c06
Made it possible to build the clients and tests on Windows using Visual Studio
2016-06-28 16:38:45 +02:00
Cedric Nugteren
ca386f9883
Added fp16 to the alltuners target
2016-06-27 11:46:33 +02:00
Cedric Nugteren
61203453aa
Renamed all C++ source files to .cpp to match the .hpp extension better
2016-06-19 13:55:49 +02:00
Cedric Nugteren
f726fbdc9f
Moved all headers into the source tree, changed headers to .hpp extension
2016-06-18 20:20:13 +02:00
Cedric Nugteren
bacb5d2bb2
Clean-up of the routine class, moved RunKernel to the routine/common file
2016-06-18 18:16:14 +02:00
Cedric Nugteren
52ccaf5b25
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing
2016-06-16 18:07:46 +02:00
Cedric Nugteren
b894611ad1
Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately
2016-06-14 18:17:58 +02:00
Cedric Nugteren
6d6b030053
Made the CPU BLAS library the default reference to test against in favor of clBLAS
2016-06-08 09:21:39 +02:00
Cedric Nugteren
7a7873d552
Fixed the RPATH settings for linking on OS X
2016-06-06 13:40:52 +02:00
Cedric Nugteren
983df6a8b4
Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test'
2016-05-31 20:53:55 +02:00
Cedric Nugteren
305bf16c4c
Separated the performance tests (clients) from the correctness tests in CMake
2016-05-30 16:38:26 +02:00
Cedric Nugteren
489c5d76cf
Merged in latest changes from 0.7.1 release
2016-05-18 21:32:56 +02:00
Cedric Nugteren
591e343ec9
Added an example of using the half-precision HAXPY routine
2016-05-15 20:18:34 +02:00
Cedric Nugteren
4b6bdd83a2
Added header with conversions from and to half-precision floating-point
2016-05-15 20:13:57 +02:00
Cedric Nugteren
c5730c8b43
Updated to version 0.7.0
2016-05-08 20:29:41 +02:00
Cedric Nugteren
2952390f27
Added an example to demonstrate the use of the ClearCache and FillCache functions
2016-04-29 23:33:36 +02:00
Cedric Nugteren
4f528b1730
Added sample C programs for the SASUM and DGEMV routines
2016-04-29 20:33:19 +02:00
Cedric Nugteren
82be8f211c
Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache
2016-04-27 16:02:13 +02:00
cnugteren
16a048f1ac
Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines
2016-04-20 22:12:51 -06:00
cnugteren
8be99de82d
Added support for the SASUM/DASUM/ScASUM/DzASUM routines
2016-04-14 19:58:26 -06:00
cnugteren
c2cfee76c4
Properly set warning flags for Clang
2016-04-04 08:39:13 -07:00
cnugteren
1a82861a90
Added support for testing (performance and correctness) against a CPU BLAS library
2016-04-02 11:58:00 -07:00
cnugteren
a2056f2216
Create a first version of CPU BLAS detection in CMake
2016-03-31 22:22:29 -07:00
cnugteren
8c3c6db7d0
Merge branch 'level1_routines' into development
2016-03-30 21:37:56 -07:00
cnugteren
6578102ae9
CMake now downloads the cl.hpp header from the Khronos website when building the samples
2016-03-30 16:24:38 -07:00
Cedric Nugteren
aaa687ca98
Added preliminary support for the xNRM2 routines
2016-03-28 23:00:44 +02:00
Cedric Nugteren
706c6987c6
Fixed compilation of the two SGEMM samples
2016-03-23 20:31:25 +01:00
Cedric Nugteren
bf4bd072e2
Updated to version 0.6.0
2016-03-13 11:02:40 +01:00
Cedric Nugteren
306bf67660
Added preliminary support for xHPR2 and xSPR2 routines
2016-03-06 15:48:11 +01:00
Cedric Nugteren
60da54da5d
Added preliminary support for xHER2 and xSYR2 routines
2016-03-02 21:18:01 +01:00
Cedric Nugteren
e3545215a5
Added support for xHER, xHPR, xSYR, and xSPR routines
2016-02-28 14:16:48 +01:00
Cedric Nugteren
6dc44da07b
Added support for xGERU and xGERC routines
2016-02-20 14:15:41 +01:00
Cedric Nugteren
8854a73127
Added XGER routine, kernel, and tuner
2016-02-20 12:40:01 +01:00
Cedric Nugteren
bb985f010b
Changed the order of tuners in the alltuners target
2016-02-06 12:48:42 +01:00
CNugteren
9622d3be22
Fixes for compilation under Visual Studio
2016-01-30 14:57:49 +01:00
Cedric Nugteren
44fb40e5c4
Prepared for MSVC support
2016-01-30 11:54:29 +01:00
CNugteren
92404035e8
Updated to version 0.5.0
2015-10-17 15:48:13 +02:00
CNugteren
2b56c2c603
Added TRMV/TBMV/TPMV routines
2015-09-26 16:58:03 +02:00
CNugteren
de6547a92b
Added SBMV and SPMV routines
2015-09-19 18:01:19 +02:00
CNugteren
80da67d28b
Added the HPMV routine
2015-09-19 17:40:38 +02:00
CNugteren
aebd156869
Added the HBMV routine
2015-09-19 11:11:34 +02:00
CNugteren
4507ba4997
Added first version of banded matrix-vector multiplication
2015-09-18 15:25:20 +02:00
CNugteren
a2e726d3bd
Added xDOT/xDOTU/xDOTC dot-product routines
2015-09-14 16:57:00 +02:00
CNugteren
ff0c54c386
Added the XSWAP, XSCAL and XCOPY level-1 routines
2015-08-22 17:11:20 +02:00
CNugteren
74f601794d
Updated to version 0.4.0
2015-08-22 12:41:40 +02:00
CNugteren
5f5d31754a
Added clblast prefix to binaries and added the alltests target
2015-08-21 07:36:19 +02:00
Cedric Nugteren
cf168fca70
Merge pull request #23 from CNugteren/tuner_database
Added initial version of a tuner-database
2015-08-20 08:38:18 +02:00
CNugteren
07e393cce4
Added target to run all tuners
2015-08-19 19:35:56 +02:00
CNugteren
a6c104ef20
Added SGEMM example using the C API
2015-08-13 13:47:15 +02:00
CNugteren
8617195ac5
Added initial version of C API with just one routine
2015-08-13 13:46:13 +02:00
CNugteren
dbdb58c600
Refactored the tuners, added JSON output
2015-08-09 15:50:41 +02:00
CNugteren
938ca2707f
Added HEMV routine
2015-07-31 17:35:42 +02:00
CNugteren
b89517a2e7
Added SYMV routine
2015-07-31 17:13:41 +02:00
CNugteren
68044254c7
Removed clBLAS source code, now requires separate installation
2015-07-31 11:06:07 +02:00
CNugteren
e4c9f4cfe5
Moved the preferred options of clBLAS (no tests) to the CLBlast CMakeLists file
2015-07-27 07:34:19 +02:00