Matthias Vogelgesang
34e537a5c1
Use GNUInstallDirs to determine install paths
The GNUInstallDirs module* provides variables that match the install directories
for GNU software and allows users to override them. Without hardcoded paths,
packagers can choose library paths according to distribution policies (e.g.
lib, lib64, lib<arch>, ...).
* https://cmake.org/cmake/help/v3.0/module/GNUInstallDirs.html
2017-10-23 15:54:55 +02:00
Cedric Nugteren
42dcd8fd8a
Merge pull request #204 from CNugteren/cuda_api
CUDA API to CLBlast
2017-10-20 12:07:30 +02:00
Cedric Nugteren
a3069a97c3
Prepared test and client infrastructure for use with the CUDA API
2017-10-15 13:56:19 +02:00
Cedric Nugteren
48133a0cd1
Added an option to choose whether to override the MSVC flags from /MT to /MD (default ON)
2017-10-14 16:26:35 +02:00
Cedric Nugteren
74d6e0048c
Added DAXPY example for the CUDA API
2017-10-14 12:23:35 +02:00
Cedric Nugteren
16b9efd605
Added first untested CUDA sample
2017-10-14 10:50:28 +02:00
Cedric Nugteren
b901809345
Added first (untested) version of a CUDA API
2017-10-11 23:16:57 +02:00
Cedric Nugteren
df3c9f4a8a
Moved non-routine-specific API functions and includes to separate files
2017-10-08 21:52:02 +02:00
Cedric Nugteren
f4c4674cf6
Updated to version 1.1.0
2017-09-30 17:19:17 +02:00
Cedric Nugteren
2ef6578961
Added first version of a small CLBlast diagnostics helper
2017-09-19 21:43:35 +02:00
Cedric Nugteren
76382ff6c1
Added the new vendor-architecture-name hierarchy to the tuners as well
2017-09-10 16:34:54 +02:00
Cedric Nugteren
91ea7fcde2
Introduced the notion of a device-architecture for the database and added device and architecture name mappings
2017-09-08 21:09:05 +02:00
Cedric Nugteren
20da5e33a8
Split the database files over multiple directories and files; first step towards separate compilation
2017-09-06 21:50:42 +02:00
Cedric Nugteren
777681dcbd
Merge branch 'master' into im_to_col
2017-08-12 20:50:00 +02:00
Cedric Nugteren
d30c459c5f
Fixed .hpp -> .h typo in CMakeLists
2017-08-12 16:11:23 +02:00
Cedric Nugteren
f6b6d7ef4b
Properly set the common test utilities in the CMake files
2017-08-12 16:07:28 +02:00
Cedric Nugteren
844e68853e
Moved some utility functions to a test-specific utility compilation-unit
2017-08-12 15:38:17 +02:00
Cedric Nugteren
d588f28dbe
Updated CMakeLists to include header files such that IDEs can locate them
2017-08-11 21:20:40 +02:00
Cedric Nugteren
eb896838b1
Updated to version 1.0.1 (bugfix release)
2017-08-08 20:35:49 +02:00
Cedric Nugteren
1155c068e9
Updated to version 1.0.0
2017-07-30 20:54:21 +02:00
Cedric Nugteren
b494df1111
Fixes warnings for Clang & AppleClang
2017-07-30 18:52:20 +02:00
Cedric Nugteren
6ceb9b7152
Fixes to AppVeyor and Travis scripts
2017-07-30 18:34:39 +02:00
Cedric Nugteren
f2477f6636
Removed spurious warning for Clang < 3.9
2017-07-12 20:58:31 +02:00
Cedric Nugteren
84ec50e29d
Added interface and stubs for the im2col routine
2017-07-02 12:10:22 +02:00
Cedric Nugteren
52881f3864
Added batched GEMM example program
2017-06-29 21:15:25 +02:00
Cedric Nugteren
4e51b1e1f8
Moved and inlined some static member variables and disabled spurious clang warnings
2017-06-27 21:05:16 +02:00
Cedric Nugteren
ce528a9d39
Fixed and suppressed several warnings for MSVC
2017-06-26 21:38:04 +02:00
Cedric Nugteren
a823edb65f
Reduced optimization level for the (non-performance-critical) host code to speed up compilation
2017-06-26 21:36:56 +02:00
Cedric Nugteren
e9d2a2f54c
Updated to version 0.11.0
2017-05-02 20:29:59 +02:00
Cedric Nugteren
e3bb58f602
Finalized support for performance testing against cuBLAS
2017-04-16 17:53:51 +02:00
Cedric Nugteren
0cebcbcc71
Added proper CMake searching for CUDA and cuBLAS
2017-04-03 21:45:18 +02:00
Cedric Nugteren
b24d364743
Laid the groundwork for cuBLAS comparisons in the clients
2017-04-02 18:06:15 +02:00
Cedric Nugteren
49e04c7fce
Added API and test infrastructure for the batched GEMM routine
2017-03-10 21:24:35 +01:00
Cedric Nugteren
b114ea49a9
Added first naive version of the batched AXPY routine
2017-03-05 15:06:14 +01:00
Cedric Nugteren
ea6790665d
Merge branch 'development' into triangular_solvers
2017-02-26 14:51:45 +01:00
Cedric Nugteren
492ee3d0a5
Removed the invert routine from the tests
2017-02-25 12:28:13 +01:00
Cedric Nugteren
bdc57221bd
Added simple tests for the OverrideParameters function
2017-02-14 21:09:00 +01:00
Cedric Nugteren
c248f900c0
Merge branch 'development' into triangular_solvers
2017-02-05 22:18:59 +01:00
Cedric Nugteren
a5fd2323b6
Added prototype for the TRSV routine
2017-01-20 11:30:32 +01:00
Cedric Nugteren
4b3ffd9989
Added a first version of the diagonal block invert routine in preparation for TRSM
2017-01-15 17:30:00 +01:00
Cedric Nugteren
ff2bf985a3
Updated the link to cl.hpp in the Khronos registry for the samples
2017-01-07 13:57:23 +01:00
Cedric Nugteren
681a465b35
Prepared for the addition of the TRSM triangular solver kernel
2016-12-18 12:30:16 +01:00
Cedric Nugteren
2cf7d8429a
Updated to version 0.10.0
2016-11-27 13:34:18 +01:00
Cedric Nugteren
39c49bf4f9
Made it possible to use the command-line environment variables for each executable without re-running CMake
2016-11-27 11:00:29 +01:00
Cedric Nugteren
2ff3f77392
Made the Netlib SGEMM example also optionally compiled
2016-11-23 22:07:11 +01:00
Cedric Nugteren
fa42befcc1
Made compilation of the Netlib CBLAS API conditional
2016-11-23 21:33:35 +01:00
Cedric Nugteren
bb14a5880e
Added an example and documentation for the Netlib CBLAS API
2016-10-25 20:37:33 +02:00
Cedric Nugteren
8ae8ab06a2
Renamed the include and source files of the Netlib CBLAS API
2016-10-25 20:33:10 +02:00
Cedric Nugteren
140121ef91
Removed the clblast namespace from the Netlib C API source file to ensure proper linking
2016-10-25 20:21:50 +02:00
Cedric Nugteren
f96fd372bc
Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes
2016-10-25 14:28:52 +02:00