178 commits

Author SHA1 Message Date
Cedric Nugteren f2477f6636 Removed spurious warning for Clang < 3.9 2017-07-12 20:58:31 +02:00
Cedric Nugteren 84ec50e29d Added interface and stubs for the im2col routine 2017-07-02 12:10:22 +02:00
Cedric Nugteren 52881f3864 Added batched GEMM example program 2017-06-29 21:15:25 +02:00
Cedric Nugteren 4e51b1e1f8 Moved and inlined some static member variables and disabled spurious clang warnings 2017-06-27 21:05:16 +02:00
Cedric Nugteren ce528a9d39 Fixed and suppressed several warnings for MSVC 2017-06-26 21:38:04 +02:00
Cedric Nugteren a823edb65f Reduced optimization level for the (non-performance critical) host-code to speed-up compilation 2017-06-26 21:36:56 +02:00
Cedric Nugteren e9d2a2f54c Updated to version 0.11.0 2017-05-02 20:29:59 +02:00
Cedric Nugteren e3bb58f602 Finalized support for performance testing against cuBLAS 2017-04-16 17:53:51 +02:00
Cedric Nugteren 0cebcbcc71 Added proper CMake searching for CUDA and cuBLAS 2017-04-03 21:45:18 +02:00
Cedric Nugteren b24d364743 Laid the groundwork for cuBLAS comparisons in the clients 2017-04-02 18:06:15 +02:00
Cedric Nugteren 49e04c7fce Added API and test infrastructure for the batched GEMM routine 2017-03-10 21:24:35 +01:00
Cedric Nugteren b114ea49a9 Added first naive version of the batched AXPY routine 2017-03-05 15:06:14 +01:00
Cedric Nugteren ea6790665d Merge branch 'development' into triangular_solvers 2017-02-26 14:51:45 +01:00
Cedric Nugteren 492ee3d0a5 Removed the invert routine from the tests 2017-02-25 12:28:13 +01:00
Cedric Nugteren bdc57221bd Added simple tests for the OverrideParameters function 2017-02-14 21:09:00 +01:00
Cedric Nugteren c248f900c0 Merge branch 'development' into triangular_solvers 2017-02-05 22:18:59 +01:00
Cedric Nugteren a5fd2323b6 Added prototype for the TRSV routine 2017-01-20 11:30:32 +01:00
Cedric Nugteren 4b3ffd9989 Added a first version of the diagonal block invert routine in preparation of TRSM 2017-01-15 17:30:00 +01:00
Cedric Nugteren ff2bf985a3 Updated the link to cl.hpp in the Khronos registry for the samples 2017-01-07 13:57:23 +01:00
Cedric Nugteren 681a465b35 Prepared for the addition of the TRSM triangular solver kernel 2016-12-18 12:30:16 +01:00
Cedric Nugteren 2cf7d8429a Updated to version 0.10.0 2016-11-27 13:34:18 +01:00
Cedric Nugteren 39c49bf4f9 Made it possible to use the command-line environmental variables for each executable and without re-running CMake 2016-11-27 11:00:29 +01:00
Cedric Nugteren 2ff3f77392 Made the Netlib SGEMM example also optionally compiled 2016-11-23 22:07:11 +01:00
Cedric Nugteren fa42befcc1 Made compilation of the Netlib CBLAS API conditional 2016-11-23 21:33:35 +01:00
Cedric Nugteren bb14a5880e Added an example and documentation for the Netlib CBLAS API 2016-10-25 20:37:33 +02:00
Cedric Nugteren 8ae8ab06a2 Renamed the include and source files of the Netlib CBLAS API 2016-10-25 20:33:10 +02:00
Cedric Nugteren 140121ef91 Removed the clblast namespace from the Netlib C API source file to ensure proper linking 2016-10-25 20:21:50 +02:00
Cedric Nugteren f96fd372bc Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes 2016-10-25 14:28:52 +02:00
Cedric Nugteren fda39ffd86 Fixed the CMakeLists.txt for Visual Studio compilation 2016-10-23 14:34:46 +02:00
Cedric Nugteren de0420dffa Minor clean-up of the CMakeLists file 2016-10-22 16:38:42 +02:00
Cedric Nugteren b0ff11acf0 Moved files around a bit; created a utilities subfolder 2016-10-22 15:36:48 +02:00
Cedric Nugteren 280698d076 Merge pull request #117 from intelfx/exceptions
Convert to use C++ exceptions internally
2016-10-22 15:05:12 +02:00
Ivan Shapovalov b98af44fcf treewide: use C++ exceptions properly
Since the codebase is designed around proper C++ idioms such as RAII, it
makes sense to only use C++ exceptions internally instead of mixing
exceptions and error codes. The exceptions are now caught at top level
to preserve compatibility with the existing error code-based API.

Note that we deliberately do not catch C++ runtime errors (such as
`std::bad_alloc`) nor logic errors (aka failed assertions) because no
actual handling can ever happen for such errors.

However, in the C interface we do catch _all_ exceptions (...) and
convert them into a wild-card error code.
2016-10-22 08:45:25 +03:00
Cedric Nugteren 597974b40d Merge pull request #118 from matze/add-pkg-config
Generate and install pkg-config description
2016-10-21 21:00:07 +02:00
Matthias Vogelgesang 3797d144cc Generate and install pkg-config description 2016-10-21 09:38:25 +02:00
Cedric Nugteren c8d0e41e84 Added the possibility to supply the env-variable CLBLAST_TEST_ARGUMENTS to specify options for the make alltest or ctest targets 2016-10-20 23:05:16 +02:00
Cedric Nugteren 53deed298f Added documentation and minor refactoring for the recent support of static library compilation 2016-10-15 17:11:08 +02:00
Shehzan Mohammed 0d958bf3b3 Fixes for static lib compilation on Windows 2016-10-14 18:45:34 -04:00
Cedric Nugteren c0482ace6c Fixed a bug where clblas.h couldn't be found for the performance tests (clients) 2016-10-14 22:11:35 +02:00
Cedric Nugteren 3386ad49c4 Set proper flags for the verbose mode (debug flags) 2016-10-14 20:54:05 +02:00
Cedric Nugteren 99a620f9a1 Merge pull request #112 from shehzan10/static
Add option to build shared or static library
2016-10-14 10:06:44 +02:00
Shehzan Mohammed 56f07e42b1 Add option to build shared or static library 2016-10-13 12:03:44 -04:00
Cedric Nugteren a9d35cf04c Merge branch 'development' into gemm_direct 2016-10-01 13:45:08 +02:00
Anton Lokhmotov c484bb26b6 Use cross-platform thread lib idiom instead of *nix-specific pthread. 2016-09-26 21:04:28 +00:00
Anton Lokhmotov c20a5bb7ca Link clBLAS together with pthread. 2016-09-26 10:30:18 +00:00
Cedric Nugteren 73d135c2ce Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, NWGD and KWGD into one WGD parameter 2016-09-25 14:48:34 +02:00
Anton Lokhmotov 750f185ba9 Add path to ref library header when building tests. 2016-09-24 11:46:34 +00:00
Cedric Nugteren 4b94afda94 Updated to version 0.9.0 2016-09-13 19:20:39 +02:00
Cedric Nugteren 48ab0428cb Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM 2016-09-13 19:08:49 +02:00
Ivan Shapovalov 9095537a6a CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings 2016-09-13 16:12:30 +03:00
Cedric Nugteren 35623cd98d Minor update regarding the previous CMake export/install target changes 2016-07-28 20:45:09 +02:00
Ivan Shapovalov b5d7b58393 CMakeLists.txt: use target_include_directories() 2016-07-28 19:09:29 +03:00
Ivan Shapovalov 570cbcffa7 CMakeLists.txt: provide a find_package() config for dependent projects 2016-07-28 19:09:29 +03:00
Ivan Shapovalov a1d80e7402 CMakeLists.txt: use ${clblast_SOURCE_DIR} instead of ${CMAKE_SOURCE_DIR} 2016-07-22 11:15:52 +03:00
Cedric Nugteren 27854070b4 Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen 2016-07-06 21:50:12 +02:00
CNugteren 2d665099ef Fixed a linking issue with the tuners on Visual Studio 2016-07-04 19:46:14 +02:00
Cedric Nugteren b330ab0866 Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library 2016-06-30 10:49:17 +02:00
Cedric Nugteren 577f0ee117 Updated to version 0.8.0 2016-06-28 21:32:00 +02:00
CNugteren 871b576c06 Made it possible to build the clients and tests on Windows using Visual Studio 2016-06-28 16:38:45 +02:00
Cedric Nugteren ca386f9883 Added fp16 to the alltuners target 2016-06-27 11:46:33 +02:00
Cedric Nugteren 61203453aa Renamed all C++ source files to .cpp to match the .hpp extension better 2016-06-19 13:55:49 +02:00
Cedric Nugteren f726fbdc9f Moved all headers into the source tree, changed headers to .hpp extension 2016-06-18 20:20:13 +02:00
Cedric Nugteren bacb5d2bb2 Clean-up of the routine class, moved RunKernel to the routine/common file 2016-06-18 18:16:14 +02:00
Cedric Nugteren 52ccaf5b25 Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing 2016-06-16 18:07:46 +02:00
Cedric Nugteren b894611ad1 Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately 2016-06-14 18:17:58 +02:00
Cedric Nugteren 6d6b030053 Made the CPU BLAS library the default reference to test against in favor of clBLAS 2016-06-08 09:21:39 +02:00
Cedric Nugteren 7a7873d552 Fixed the RPATH settings for linking on OS X 2016-06-06 13:40:52 +02:00
Cedric Nugteren 983df6a8b4 Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test' 2016-05-31 20:53:55 +02:00
Cedric Nugteren 305bf16c4c Separated the performance tests (clients) from the correctness tests in CMake 2016-05-30 16:38:26 +02:00
Cedric Nugteren 489c5d76cf Merged in latest changes from 0.7.1 release 2016-05-18 21:32:56 +02:00
Cedric Nugteren 591e343ec9 Added an example of using the half-precision HAXPY routine 2016-05-15 20:18:34 +02:00
Cedric Nugteren 4b6bdd83a2 Added header with conversions from and to half-precision floating-point 2016-05-15 20:13:57 +02:00
Cedric Nugteren c5730c8b43 Updated to version 0.7.0 2016-05-08 20:29:41 +02:00
Cedric Nugteren 2952390f27 Added an example to demonstrate the use of the ClearCache and FillCache functions 2016-04-29 23:33:36 +02:00
Cedric Nugteren 4f528b1730 Added sample C programs for the SASUM and DGEMV routines 2016-04-29 20:33:19 +02:00
Cedric Nugteren 82be8f211c Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache 2016-04-27 16:02:13 +02:00
cnugteren 16a048f1ac Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines 2016-04-20 22:12:51 -06:00
cnugteren 8be99de82d Added support for the SASUM/DASUM/ScASUM/DzASUM routines 2016-04-14 19:58:26 -06:00
cnugteren c2cfee76c4 Properly set warning flags for Clang 2016-04-04 08:39:13 -07:00
cnugteren 1a82861a90 Added support for testing (performance and correctness) against a CPU BLAS library 2016-04-02 11:58:00 -07:00
cnugteren a2056f2216 Create a first version of CPU BLAS detection in CMake 2016-03-31 22:22:29 -07:00
cnugteren 8c3c6db7d0 Merge branch 'level1_routines' into development 2016-03-30 21:37:56 -07:00
cnugteren 6578102ae9 CMake now downloads the cl.hpp header from the Khronos website when building the samples 2016-03-30 16:24:38 -07:00
Cedric Nugteren aaa687ca98 Added preliminary support for the xNRM2 routines 2016-03-28 23:00:44 +02:00
Cedric Nugteren 706c6987c6 Fixed compilation of the two SGEMM samples 2016-03-23 20:31:25 +01:00
Cedric Nugteren bf4bd072e2 Updated to version 0.6.0 2016-03-13 11:02:40 +01:00
Cedric Nugteren 306bf67660 Added preliminary support for xHPR2 and xSPR2 routines 2016-03-06 15:48:11 +01:00
Cedric Nugteren 60da54da5d Added preliminary support for xHER2 and xSYR2 routines 2016-03-02 21:18:01 +01:00
Cedric Nugteren e3545215a5 Added support for xHER, xHPR, xSYR, and xSPR routines 2016-02-28 14:16:48 +01:00
Cedric Nugteren 6dc44da07b Added support for xGERU and xGERC routines 2016-02-20 14:15:41 +01:00
Cedric Nugteren 8854a73127 Added XGER routine, kernel, and tuner 2016-02-20 12:40:01 +01:00
Cedric Nugteren bb985f010b Changed the order of tuners in the alltuners target 2016-02-06 12:48:42 +01:00
CNugteren 9622d3be22 Fixes for compilation under Visual Studio 2016-01-30 14:57:49 +01:00
Cedric Nugteren 44fb40e5c4 Prepared for MSVC support 2016-01-30 11:54:29 +01:00
CNugteren 92404035e8 Updated to version 0.5.0 2015-10-17 15:48:13 +02:00
CNugteren 2b56c2c603 Added TRMV/TBMV/TPMV routines 2015-09-26 16:58:03 +02:00
CNugteren de6547a92b Added SBMV and SPMV routines 2015-09-19 18:01:19 +02:00
CNugteren 80da67d28b Added the HPMV routine 2015-09-19 17:40:38 +02:00
CNugteren aebd156869 Added the HBMV routine 2015-09-19 11:11:34 +02:00
CNugteren 4507ba4997 Added first version of banded matrix-vector multiplication 2015-09-18 15:25:20 +02:00
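
Commit b98af44fcf above describes using exceptions internally, catching library errors at the top level to keep the error-code API, and catching everything in the C interface. The following is a minimal sketch of that pattern, not the actual CLBlast code: the names `StatusCode`, `LibraryError`, `ScaleInternal`, `Scale`, and `ScaleC` are hypothetical and chosen only for illustration.

```cpp
#include <stdexcept>

// Hypothetical status codes for illustration; not the library's real values.
enum StatusCode { kSuccess = 0, kInvalidArgument = -1, kUnexpectedError = -2 };

// Hypothetical library exception that carries the status code to report.
class LibraryError : public std::runtime_error {
 public:
  LibraryError(StatusCode status, const char* message)
      : std::runtime_error(message), status_(status) {}
  StatusCode status() const { return status_; }
 private:
  StatusCode status_;
};

// Internal code reports failures by throwing, never by returning a code.
void ScaleInternal(float* x, int n, float alpha) {
  if (x == nullptr || n < 0) {
    throw LibraryError(kInvalidArgument, "invalid argument");
  }
  for (int i = 0; i < n; ++i) { x[i] *= alpha; }
}

// C++ entry point: converts library exceptions into status codes.
// Anything else (e.g. std::bad_alloc) deliberately propagates to the caller.
StatusCode Scale(float* x, int n, float alpha) {
  try {
    ScaleInternal(x, n, alpha);
    return kSuccess;
  } catch (const LibraryError& e) {
    return e.status();
  }
}

// C entry point: no exception may cross the C boundary, so every remaining
// exception is mapped to a wild-card error code.
extern "C" int ScaleC(float* x, int n, float alpha) {
  try {
    return static_cast<int>(Scale(x, n, alpha));
  } catch (...) {
    return static_cast<int>(kUnexpectedError);
  }
}
```

In this sketch a caller of `Scale` or `ScaleC` with a null pointer receives `kInvalidArgument` rather than an exception, while only the C wrapper swallows unrelated exceptions.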