Cedric Nugteren
c5a28cd70b
Added CLBlast version numbering to the compiled library
2018-02-11 15:31:21 +01:00
Cedric Nugteren
ef5008f5e4
Created the API and stubs for the HAD (Hadamard-product) routines
2018-01-31 20:41:02 +01:00
Cedric Nugteren
37c5e8f58c
Updated to CLBlast version 1.3.0
2018-01-29 20:45:21 +01:00
Cedric Nugteren
d1d80ca131
Fixed a compilation error of the kernel-preprocessor test under MSVC
2018-01-29 20:26:25 +01:00
Cedric Nugteren
0e5eaa6eb9
Factored out the generic parts of the GEMM routine tuner
2018-01-15 21:32:51 +01:00
Cedric Nugteren
90e8e55acb
Added test for the RetrieveParameters function
2018-01-11 20:34:09 +01:00
Cedric Nugteren
9fb2c61b25
Added API and tests for new GemmStridedBatched routine
2018-01-07 14:27:15 +01:00
Cedric Nugteren
1e738db6dd
Split the database into multiple small compilation units
2017-12-27 12:04:22 +01:00
Cedric Nugteren
bd540829ea
Fixes for the CUDA backend of CLBlast
2017-12-24 12:10:55 +01:00
Cedric Nugteren
8657e90cf8
Fixed linking of the preprocessor test for MSVC
2017-12-24 11:33:47 +01:00
Cedric Nugteren
b1f52f130c
Updated the database to use the new TRSV and Invert tuners
2017-12-23 13:55:22 +01:00
Cedric Nugteren
aa7db4f987
Added TRSV block-size tuner
2017-12-23 13:34:57 +01:00
Cedric Nugteren
07a7012b0d
Added skeleton for a tuner for the invert kernel
2017-12-19 21:10:48 +01:00
Cedric Nugteren
c0c6d00b12
Added stub for a preprocessor and a corresponding compilation test
2017-11-25 10:24:05 +01:00
Cedric Nugteren
c6690df896
Made the tuners compile by default
2017-11-19 14:33:25 +01:00
Cedric Nugteren
8d2f7d53aa
Added a library with common tuner sources to speed up compilation
2017-11-19 12:59:28 +01:00
Cedric Nugteren
f94d498a37
Moved the compilation function to a separate file; removed the tuners' dependency on the CLBlast library
2017-11-17 20:57:46 +01:00
Cedric Nugteren
d9cf206979
Removed dependency on CLTune
2017-11-16 21:28:36 +01:00
Cedric Nugteren
1b2b46f2f0
Added first version of integrated and re-written auto-tuner
2017-11-15 22:49:35 +01:00
Cedric Nugteren
0cd78bb6f9
Added kernel timing functionality to the utilities
2017-11-15 22:47:06 +01:00
Cedric Nugteren
5d5e3f93bc
Updated to CLBlast version 1.2.0
2017-11-08 21:30:06 +01:00
Cedric Nugteren
b18cc9d3f1
Merge pull request #212 from CNugteren/kernel_selection_tuner
GEMM kernel selection tuner
2017-11-07 22:20:13 +01:00
Cedric Nugteren
9b0a435fb0
Integrated the GEMM routine tuner for kernel selection; added first tuning results
2017-11-02 21:47:14 +01:00
Cedric Nugteren
f24d611e57
Made it possible to compile the CLBlast performance clients for Android with the NDK
2017-10-29 13:02:14 +01:00
Cedric Nugteren
334a26eb12
Added initial version of a GEMM kernel selection tuner
2017-10-28 17:30:29 +02:00
Cedric Nugteren
bd57dfa435
Moved timing function to a separate file
2017-10-28 14:12:05 +02:00
Cedric Nugteren
8579b2b494
Added a DTRSM C++ interface example
2017-10-27 21:53:19 +02:00
Matthias Vogelgesang
34e537a5c1
Use GNUInstallDirs to determine install paths
The GNUInstallDirs module* provides variables that match the install directories
for GNU software and allows users to override them. Without hardcoded paths,
packagers can choose library paths according to distribution policies (e.g.
lib, lib64, lib<arch>, ...).
* https://cmake.org/cmake/help/v3.0/module/GNUInstallDirs.html
2017-10-23 15:54:55 +02:00
Cedric Nugteren
42dcd8fd8a
Merge pull request #204 from CNugteren/cuda_api
CUDA API to CLBlast
2017-10-20 12:07:30 +02:00
Cedric Nugteren
a3069a97c3
Prepared test and client infrastructure for use with the CUDA API
2017-10-15 13:56:19 +02:00
Cedric Nugteren
48133a0cd1
Added an option to choose whether to override the MSVC flags from /MT to /MD (default ON)
2017-10-14 16:26:35 +02:00
Cedric Nugteren
74d6e0048c
Added DAXPY example for the CUDA API
2017-10-14 12:23:35 +02:00
Cedric Nugteren
16b9efd605
Added first untested CUDA sample
2017-10-14 10:50:28 +02:00
Cedric Nugteren
b901809345
Added first (untested) version of a CUDA API
2017-10-11 23:16:57 +02:00
Cedric Nugteren
df3c9f4a8a
Moved non-routine-specific API functions and includes to separate files
2017-10-08 21:52:02 +02:00
Cedric Nugteren
f4c4674cf6
Updated to version 1.1.0
2017-09-30 17:19:17 +02:00
Cedric Nugteren
2ef6578961
Added first version of a small CLBlast diagnostics helper
2017-09-19 21:43:35 +02:00
Cedric Nugteren
76382ff6c1
Added the new vendor-architecture-name hierarchy to the tuners as well
2017-09-10 16:34:54 +02:00
Cedric Nugteren
91ea7fcde2
Introduced the notion of a device-architecture for the database and added device and architecture name mappings
2017-09-08 21:09:05 +02:00
Cedric Nugteren
20da5e33a8
Split the database files over multiple directories and files; first step towards separate compilation
2017-09-06 21:50:42 +02:00
Cedric Nugteren
777681dcbd
Merge branch 'master' into im_to_col
2017-08-12 20:50:00 +02:00
Cedric Nugteren
d30c459c5f
Fixed .hpp -> .h typo in CMakeLists
2017-08-12 16:11:23 +02:00
Cedric Nugteren
f6b6d7ef4b
Properly set the common test utilities in the CMake files
2017-08-12 16:07:28 +02:00
Cedric Nugteren
844e68853e
Moved some utility functions to a test-specific utility compilation-unit
2017-08-12 15:38:17 +02:00
Cedric Nugteren
d588f28dbe
Updated CMakeLists to include header files such that IDEs can locate them
2017-08-11 21:20:40 +02:00
Cedric Nugteren
eb896838b1
Updated to version 1.0.1 (bugfix release)
2017-08-08 20:35:49 +02:00
Cedric Nugteren
1155c068e9
Updated to version 1.0.0
2017-07-30 20:54:21 +02:00
Cedric Nugteren
b494df1111
Fixes warnings for Clang & AppleClang
2017-07-30 18:52:20 +02:00
Cedric Nugteren
6ceb9b7152
Fixes to AppVeyor and Travis scripts
2017-07-30 18:34:39 +02:00
Cedric Nugteren
f2477f6636
Removed spurious warning for Clang < 3.9
2017-07-12 20:58:31 +02:00