Commit graph

178 commits

Author SHA1 Message Date
Cedric Nugteren 35623cd98d Minor update regarding the previous CMake export/install target changes 2016-07-28 20:45:09 +02:00
Ivan Shapovalov b5d7b58393 CMakeLists.txt: use target_include_directories() 2016-07-28 19:09:29 +03:00
Ivan Shapovalov 570cbcffa7 CMakeLists.txt: provide a find_package() config for dependent projects 2016-07-28 19:09:29 +03:00
Ivan Shapovalov a1d80e7402 CMakeLists.txt: use ${clblast_SOURCE_DIR} instead of ${CMAKE_SOURCE_DIR} 2016-07-22 11:15:52 +03:00
Cedric Nugteren 27854070b4 Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen 2016-07-06 21:50:12 +02:00
CNugteren 2d665099ef Fixed a linking issue with the tuners on Visual Studio 2016-07-04 19:46:14 +02:00
Cedric Nugteren b330ab0866 Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library 2016-06-30 10:49:17 +02:00
Cedric Nugteren 577f0ee117 Updated to version 0.8.0 2016-06-28 21:32:00 +02:00
CNugteren 871b576c06 Made it possible to build the clients and tests on Windows using Visual Studio 2016-06-28 16:38:45 +02:00
Cedric Nugteren ca386f9883 Added fp16 to the alltuners target 2016-06-27 11:46:33 +02:00
Cedric Nugteren 61203453aa Renamed all C++ source files to .cpp to match the .hpp extension better 2016-06-19 13:55:49 +02:00
Cedric Nugteren f726fbdc9f Moved all headers into the source tree, changed headers to .hpp extension 2016-06-18 20:20:13 +02:00
Cedric Nugteren bacb5d2bb2 Clean-up of the routine class, moved RunKernel to the routine/common file 2016-06-18 18:16:14 +02:00
Cedric Nugteren 52ccaf5b25 Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing 2016-06-16 18:07:46 +02:00
Cedric Nugteren b894611ad1 Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately 2016-06-14 18:17:58 +02:00
Cedric Nugteren 6d6b030053 Made the CPU BLAS library the default reference to test against in favor of clBLAS 2016-06-08 09:21:39 +02:00
Cedric Nugteren 7a7873d552 Fixed the RPATH settings for linking on OS X 2016-06-06 13:40:52 +02:00
Cedric Nugteren 983df6a8b4 Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test' 2016-05-31 20:53:55 +02:00
Cedric Nugteren 305bf16c4c Separated the performance tests (clients) from the correctness tests in CMake 2016-05-30 16:38:26 +02:00
Cedric Nugteren 489c5d76cf Merged in latest changes from 0.7.1 release 2016-05-18 21:32:56 +02:00
Cedric Nugteren 591e343ec9 Added an example of using the half-precision HAXPY routine 2016-05-15 20:18:34 +02:00
Cedric Nugteren 4b6bdd83a2 Added header with conversions from and to half-precision floating-point 2016-05-15 20:13:57 +02:00
Cedric Nugteren c5730c8b43 Updated to version 0.7.0 2016-05-08 20:29:41 +02:00
Cedric Nugteren 2952390f27 Added an example to demonstrate the use of the ClearCache and FillCache functions 2016-04-29 23:33:36 +02:00
Cedric Nugteren 4f528b1730 Added sample C programs for the SASUM and DGEMV routines 2016-04-29 20:33:19 +02:00
Cedric Nugteren 82be8f211c Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache 2016-04-27 16:02:13 +02:00
cnugteren 16a048f1ac Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines 2016-04-20 22:12:51 -06:00
cnugteren 8be99de82d Added support for the SASUM/DASUM/ScASUM/DzASUM routines 2016-04-14 19:58:26 -06:00
cnugteren c2cfee76c4 Properly set warning flags for Clang 2016-04-04 08:39:13 -07:00
cnugteren 1a82861a90 Added support for testing (performance and correctness) against a CPU BLAS library 2016-04-02 11:58:00 -07:00
cnugteren a2056f2216 Create a first version of CPU BLAS detection in CMake 2016-03-31 22:22:29 -07:00
cnugteren 8c3c6db7d0 Merge branch 'level1_routines' into development 2016-03-30 21:37:56 -07:00
cnugteren 6578102ae9 CMake now downloads the cl.hpp header from the Khronos website when building the samples 2016-03-30 16:24:38 -07:00
Cedric Nugteren aaa687ca98 Added preliminary support for the xNRM2 routines 2016-03-28 23:00:44 +02:00
Cedric Nugteren 706c6987c6 Fixed compilation of the two SGEMM samples 2016-03-23 20:31:25 +01:00
Cedric Nugteren bf4bd072e2 Updated to version 0.6.0 2016-03-13 11:02:40 +01:00
Cedric Nugteren 306bf67660 Added preliminary support for xHPR2 and xSPR2 routines 2016-03-06 15:48:11 +01:00
Cedric Nugteren 60da54da5d Added preliminary support for xHER2 and xSYR2 routines 2016-03-02 21:18:01 +01:00
Cedric Nugteren e3545215a5 Added support for xHER, xHPR, xSYR, and xSPR routines 2016-02-28 14:16:48 +01:00
Cedric Nugteren 6dc44da07b Added support for xGERU and xGERC routines 2016-02-20 14:15:41 +01:00
Cedric Nugteren 8854a73127 Added XGER routine, kernel, and tuner 2016-02-20 12:40:01 +01:00
Cedric Nugteren bb985f010b Changed the order of tuners in the alltuners target 2016-02-06 12:48:42 +01:00
CNugteren 9622d3be22 Fixes for compilation under Visual Studio 2016-01-30 14:57:49 +01:00
Cedric Nugteren 44fb40e5c4 Prepared for MSVC support 2016-01-30 11:54:29 +01:00
CNugteren 92404035e8 Updated to version 0.5.0 2015-10-17 15:48:13 +02:00
CNugteren 2b56c2c603 Added TRMV/TBMV/TPMV routines 2015-09-26 16:58:03 +02:00
CNugteren de6547a92b Added SBMV and SPMV routines 2015-09-19 18:01:19 +02:00
CNugteren 80da67d28b Added the HPMV routine 2015-09-19 17:40:38 +02:00
CNugteren aebd156869 Added the HBMV routine 2015-09-19 11:11:34 +02:00
CNugteren 4507ba4997 Added first version of banded matrix-vector multiplication 2015-09-18 15:25:20 +02:00
CNugteren a2e726d3bd Added xDOT/xDOTU/xDOTC dot-product routines 2015-09-14 16:57:00 +02:00
CNugteren ff0c54c386 Added the XSWAP, XSCAL and XCOPY level-1 routines 2015-08-22 17:11:20 +02:00
CNugteren 74f601794d Updated to version 0.4.0 2015-08-22 12:41:40 +02:00
CNugteren 5f5d31754a Added clblast prefix to binaries and added the alltests target 2015-08-21 07:36:19 +02:00
Cedric Nugteren cf168fca70 Merge pull request #23 from CNugteren/tuner_database
Added initial version of a tuner-database
2015-08-20 08:38:18 +02:00
CNugteren 07e393cce4 Added target to run all tuners 2015-08-19 19:35:56 +02:00
CNugteren a6c104ef20 Added SGEMM example using the C API 2015-08-13 13:47:15 +02:00
CNugteren 8617195ac5 Added initial version of C API with just one routine 2015-08-13 13:46:13 +02:00
CNugteren dbdb58c600 Refactored the tuners, added JSON output 2015-08-09 15:50:41 +02:00
CNugteren 938ca2707f Added HEMV routine 2015-07-31 17:35:42 +02:00
CNugteren b89517a2e7 Added SYMV routine 2015-07-31 17:13:41 +02:00
CNugteren 68044254c7 Removed clBLAS source code, now requires separate installation 2015-07-31 11:06:07 +02:00
CNugteren e4c9f4cfe5 Moved the preferred options of clBLAS (no tests) to the CLBlast CMakeLists file 2015-07-27 07:34:19 +02:00
CNugteren efbdcd2d90 Updated to version 0.3.0 2015-07-24 08:25:32 +02:00
CNugteren aa852bbe67 Added subfolders for the level1/2/3 routines 2015-07-12 16:57:09 +02:00
CNugteren b5d39d9d0c Added the HEMM routine, tester, and client 2015-07-12 15:11:50 +02:00
CNugteren b02876d6e9 Added the HER2K routine, tester, and client 2015-07-10 20:59:20 +02:00
CNugteren 919bba3eaf Added the HERK routine, tester, and client 2015-07-10 07:19:59 +02:00
CNugteren d9ea0c47c6 Added the TRMM routine, tester, and client 2015-07-02 07:16:04 +02:00
CNugteren e5c0edbfd7 Re-organized the test infrastructure to avoid code duplication 2015-06-28 15:52:57 +02:00
CNugteren 7c8d16147a Added the SYR2K routine, tester, and client 2015-06-26 08:12:56 +02:00
CNugteren 60a88aac86 Added the SYRK routine, tester, and client 2015-06-24 07:50:18 +02:00
CNugteren 4c2a166bc5 Added test infrastructure for AB and AC routines 2015-06-21 12:57:38 +02:00
CNugteren 985eeac503 Updated to version 0.2.0 2015-06-21 09:13:08 +02:00
CNugteren e522d1a74e Added initial version of GEMV including tester and performance client 2015-06-13 11:01:20 +02:00
CNugteren bdc3444d5c Added new tester for matrix-vector-vector routines 2015-06-11 07:39:23 +02:00
CNugteren 85c1db9322 Added initial naive version of Xgemv kernel 2015-06-10 08:44:30 +02:00
CNugteren bc5a341dfe Initial commit of preview version 2015-05-30 12:30:43 +02:00