Commit graph

202 commits

Author SHA1 Message Date
Gard Spreemann 3d3492646c Correct capitalization typo
The CLBlastConfig.cmake file was installed to a directory named
CLBLast (notice second capital l), which can cause issues for CMake's
search path when looking for CLBlast on the system.

This commit also fixes other occurrences of the wrong capitalization,
all of it purely cosmetic (i.e. in comments).
2021-04-30 10:27:22 +02:00
Cedric Nugteren 70016e8698 Updated to version 1.5.2 2021-01-19 21:19:12 +01:00
Cedric Nugteren 7fab29304c Added sample to play around with XAMAX routine 2020-03-08 11:26:18 +01:00
Cedric Nugteren 8433985051 Updated to version 1.5.1 2020-02-18 10:29:40 +01:00
Cedric Nugteren 560f7a40f6 Added convgemm to the CLBlast database, added initial parameters for Skylake GPU 2018-12-31 19:05:34 +01:00
Cedric Nugteren 1f0cd61824 Added first version of a tuner for the ConvGemm direct kernel 2018-12-18 13:59:26 +09:00
Cedric Nugteren 0c9411c844 Updated to version 1.5.0 2018-12-04 20:46:02 +01:00
Cedric Nugteren d45911b61d Added groundwork for col2im algorithm plus first non-working version of kernel and test 2018-10-23 20:52:25 +02:00
Cedric Nugteren 83ba3d4b7b Merge branch 'master' into convgemm_multi_kernel 2018-09-16 20:01:18 +02:00
Cedric Nugteren 9d9f09fce9 Name change of setting to NETLIB_PERSISTENT_OPENCL 2018-08-07 22:41:06 +02:00
Cedric Nugteren fe639455bd Added an option to compile the Netlib API with static OpenCL device and context 2018-08-05 21:12:39 +02:00
Cedric Nugteren 5903820ba2 Merge branch 'master' into CLBlast-267-convgemm 2018-07-29 10:26:34 +02:00
Cedric Nugteren f84036948b Renamed AMD SI workaround defines 2018-07-27 20:38:01 +02:00
Cedric Nugteren e8dea34fce Added workaround for weird AMD SI Hainan bug 2018-07-25 22:59:36 +02:00
Cedric Nugteren db179a1e40 Updated to CLBlast version 1.4.1 2018-07-14 12:29:06 +02:00
Cedric Nugteren 1c9a741470 Merge branch 'master' into CLBlast-267-convgemm 2018-06-03 15:53:27 +02:00
Cedric Nugteren 4471b67735 Updated to CLBlast version 1.4.0 2018-06-03 13:18:05 +02:00
Cedric Nugteren bd1715aff9 Fixes for CUDA version of CLBlast 2018-06-03 10:41:57 +02:00
Cedric Nugteren 4f594e3931 Added MKL as an alternative for CBLAS for correctness and performance comparisons 2018-06-02 17:57:45 +02:00
Cedric Nugteren cbcd4ff7e8 Merge branch 'master' into CLBlast-267-convgemm 2018-05-19 17:54:27 +02:00
Cedric Nugteren 76e0079a90 Fixed compilation issues 2018-05-19 14:18:23 +02:00
Cedric Nugteren 66583b3cda The GEMM routine tuner now loads kernel JSON tuning results from disk if available; now run part of alltuners target 2018-05-19 12:48:59 +02:00
Cedric Nugteren 2d1f6ba7fe Added convgemm skeleton, test infrastructure, and first reference implementation 2018-05-06 11:35:34 +02:00
Cedric Nugteren 3e3a26e0da Fixes for the CUDA API 2018-04-20 21:50:36 +02:00
Cedric Nugteren 0e1a152023 First version of the tuning API, added interface for copy-kernel, added sample 2018-03-06 20:52:12 +01:00
Cedric Nugteren c5a28cd70b Added CLBlast version numbering to the compiled library 2018-02-11 15:31:21 +01:00
Cedric Nugteren ef5008f5e4 Created the API and stubs for the HAD (hadamard-product) routines 2018-01-31 20:41:02 +01:00
Cedric Nugteren 37c5e8f58c Updated to CLBlast version 1.3.0 2018-01-29 20:45:21 +01:00
Cedric Nugteren d1d80ca131 Fixed a compilation error of the kernel-preprocessor test under MSVC 2018-01-29 20:26:25 +01:00
Cedric Nugteren 0e5eaa6eb9 Factored out the generic parts of the GEMM routine tuner 2018-01-15 21:32:51 +01:00
Cedric Nugteren 90e8e55acb Added test for the RetrieveParameters function 2018-01-11 20:34:09 +01:00
Cedric Nugteren 9fb2c61b25 Added API and tests for new GemmStridedBatched routine 2018-01-07 14:27:15 +01:00
Cedric Nugteren 1e738db6dd Split the database into multiple small compilation units 2017-12-27 12:04:22 +01:00
Cedric Nugteren bd540829ea Fixes for the CUDA backend of CLBlast 2017-12-24 12:10:55 +01:00
Cedric Nugteren 8657e90cf8 Fixed linking of the preprocessor test for MSVC 2017-12-24 11:33:47 +01:00
Cedric Nugteren b1f52f130c Updated the database to use the new TRSV and Invert tuners 2017-12-23 13:55:22 +01:00
Cedric Nugteren aa7db4f987 Added TRSV block-size tuner 2017-12-23 13:34:57 +01:00
Cedric Nugteren 07a7012b0d Added skeleton for a tuner for the invert kernel 2017-12-19 21:10:48 +01:00
Cedric Nugteren c0c6d00b12 Added stub for a preprocessor and a corresponding compilation test 2017-11-25 10:24:05 +01:00
Cedric Nugteren c6690df896 Made the tuners be compiled by default 2017-11-19 14:33:25 +01:00
Cedric Nugteren 8d2f7d53aa Added a library with common tuner sources to speed-up compilation 2017-11-19 12:59:28 +01:00
Cedric Nugteren f94d498a37 Moved compilation function to separate file; removed dependency of tuners of the CLBlast library 2017-11-17 20:57:46 +01:00
Cedric Nugteren d9cf206979 Removed dependency on CLTune 2017-11-16 21:28:36 +01:00
Cedric Nugteren 1b2b46f2f0 Added first version of integrated and re-written auto-tuner 2017-11-15 22:49:35 +01:00
Cedric Nugteren 0cd78bb6f9 Added kernel timing functionality to the utilities 2017-11-15 22:47:06 +01:00
Cedric Nugteren 5d5e3f93bc Updated to CLBlast version 1.2.0 2017-11-08 21:30:06 +01:00
Cedric Nugteren b18cc9d3f1 Merge pull request #212 from CNugteren/kernel_selection_tuner
GEMM kernel selection tuner
2017-11-07 22:20:13 +01:00
Cedric Nugteren 9b0a435fb0 Integrated the GEMM routine tuner for kernel selection; added first tuning results 2017-11-02 21:47:14 +01:00
Cedric Nugteren f24d611e57 Made it possible to compile the CLBlast performance clients for Android with the NDK 2017-10-29 13:02:14 +01:00
Cedric Nugteren 334a26eb12 Added initial version of a GEMM kernel selection tuner 2017-10-28 17:30:29 +02:00