Cedric Nugteren
8579b2b494
Added a DTRSM C++ interface example
2017-10-27 21:53:19 +02:00
Matthias Vogelgesang
34e537a5c1
Use GNUInstallDirs to determine install paths
The GNUInstallDirs module* provides variables that match the install directories
for GNU software and allows users to override them. Without hardcoding paths,
packagers can choose library paths according to distribution policies (e.g.
lib, lib64, lib<arch>, ...).
* https://cmake.org/cmake/help/v3.0/module/GNUInstallDirs.html
2017-10-23 15:54:55 +02:00
Cedric Nugteren
42dcd8fd8a
Merge pull request #204 from CNugteren/cuda_api
Cuda API to CLBlast
2017-10-20 12:07:30 +02:00
Cedric Nugteren
a3069a97c3
Prepared test and client infrastructure for use with the CUDA API
2017-10-15 13:56:19 +02:00
Cedric Nugteren
48133a0cd1
Added an option to choose whether to override the MSVC flags from /MT to /MD (default ON)
2017-10-14 16:26:35 +02:00
Cedric Nugteren
74d6e0048c
Added DAXPY example for the CUDA API
2017-10-14 12:23:35 +02:00
Cedric Nugteren
16b9efd605
Added first untested CUDA sample
2017-10-14 10:50:28 +02:00
Cedric Nugteren
b901809345
Added first (untested) version of a CUDA API
2017-10-11 23:16:57 +02:00
Cedric Nugteren
df3c9f4a8a
Moved non-routine-specific API functions and includes to separate files
2017-10-08 21:52:02 +02:00
Cedric Nugteren
f4c4674cf6
Updated to version 1.1.0
2017-09-30 17:19:17 +02:00
Cedric Nugteren
2ef6578961
Added first version of a small CLBlast diagnostics helper
2017-09-19 21:43:35 +02:00
Cedric Nugteren
76382ff6c1
Added the new vendor-architecture-name hierarchy to the tuners as well
2017-09-10 16:34:54 +02:00
Cedric Nugteren
91ea7fcde2
Introduced the notion of a device-architecture for the database and added device and architecture name mappings
2017-09-08 21:09:05 +02:00
Cedric Nugteren
20da5e33a8
Split the database files over multiple directories and files; first step towards separate compilation
2017-09-06 21:50:42 +02:00
Cedric Nugteren
777681dcbd
Merge branch 'master' into im_to_col
2017-08-12 20:50:00 +02:00
Cedric Nugteren
d30c459c5f
Fixed .hpp -> .h typo in CMakeLists
2017-08-12 16:11:23 +02:00
Cedric Nugteren
f6b6d7ef4b
Properly set the common test utilities in the CMake files
2017-08-12 16:07:28 +02:00
Cedric Nugteren
844e68853e
Moved some utility functions to a test-specific utility compilation-unit
2017-08-12 15:38:17 +02:00
Cedric Nugteren
d588f28dbe
Updated CMakeLists to include header files such that IDEs can locate them
2017-08-11 21:20:40 +02:00
Cedric Nugteren
eb896838b1
Updated to version 1.0.1 (bugfix release)
2017-08-08 20:35:49 +02:00
Cedric Nugteren
1155c068e9
Updated to version 1.0.0
2017-07-30 20:54:21 +02:00
Cedric Nugteren
b494df1111
Fixes warnings for Clang & AppleClang
2017-07-30 18:52:20 +02:00
Cedric Nugteren
6ceb9b7152
Fixes to AppVeyor and Travis scripts
2017-07-30 18:34:39 +02:00
Cedric Nugteren
f2477f6636
Removed spurious warning for Clang < 3.9
2017-07-12 20:58:31 +02:00
Cedric Nugteren
84ec50e29d
Added interface and stubs for the im2col routine
2017-07-02 12:10:22 +02:00
Cedric Nugteren
52881f3864
Added batched GEMM example program
2017-06-29 21:15:25 +02:00
Cedric Nugteren
4e51b1e1f8
Moved and inlined some static member variables and disabled spurious clang warnings
2017-06-27 21:05:16 +02:00
Cedric Nugteren
ce528a9d39
Fixed and suppressed several warnings for MSVC
2017-06-26 21:38:04 +02:00
Cedric Nugteren
a823edb65f
Reduced optimization level for the (non-performance-critical) host code to speed up compilation
2017-06-26 21:36:56 +02:00
Cedric Nugteren
e9d2a2f54c
Updated to version 0.11.0
2017-05-02 20:29:59 +02:00
Cedric Nugteren
e3bb58f602
Finalized support for performance testing against cuBLAS
2017-04-16 17:53:51 +02:00
Cedric Nugteren
0cebcbcc71
Added proper CMake searching for CUDA and cuBLAS
2017-04-03 21:45:18 +02:00
Cedric Nugteren
b24d364743
Laid the groundwork for cuBLAS comparisons in the clients
2017-04-02 18:06:15 +02:00
Cedric Nugteren
49e04c7fce
Added API and test infrastructure for the batched GEMM routine
2017-03-10 21:24:35 +01:00
Cedric Nugteren
b114ea49a9
Added first naive version of the batched AXPY routine
2017-03-05 15:06:14 +01:00
Cedric Nugteren
ea6790665d
Merge branch 'development' into triangular_solvers
2017-02-26 14:51:45 +01:00
Cedric Nugteren
492ee3d0a5
Removed the invert routine from the tests
2017-02-25 12:28:13 +01:00
Cedric Nugteren
bdc57221bd
Added simple tests for the OverrideParameters function
2017-02-14 21:09:00 +01:00
Cedric Nugteren
c248f900c0
Merge branch 'development' into triangular_solvers
2017-02-05 22:18:59 +01:00
Cedric Nugteren
a5fd2323b6
Added prototype for the TRSV routine
2017-01-20 11:30:32 +01:00
Cedric Nugteren
4b3ffd9989
Added a first version of the diagonal block invert routine in preparation of TRSM
2017-01-15 17:30:00 +01:00
Cedric Nugteren
ff2bf985a3
Updated the link to cl.hpp in the Khronos registry for the samples
2017-01-07 13:57:23 +01:00
Cedric Nugteren
681a465b35
Prepared for the addition of the TRSM triangular solver kernel
2016-12-18 12:30:16 +01:00
Cedric Nugteren
2cf7d8429a
Updated to version 0.10.0
2016-11-27 13:34:18 +01:00
Cedric Nugteren
39c49bf4f9
Made it possible to use the command-line environment variables for each executable without re-running CMake
2016-11-27 11:00:29 +01:00
Cedric Nugteren
2ff3f77392
Made the Netlib SGEMM example also optionally compiled
2016-11-23 22:07:11 +01:00
Cedric Nugteren
fa42befcc1
Made compilation of the Netlib CBLAS API conditional
2016-11-23 21:33:35 +01:00
Cedric Nugteren
bb14a5880e
Added an example and documentation for the Netlib CBLAS API
2016-10-25 20:37:33 +02:00
Cedric Nugteren
8ae8ab06a2
Renamed the include and source files of the Netlib CBLAS API
2016-10-25 20:33:10 +02:00
Cedric Nugteren
140121ef91
Removed the clblast namespace from the Netlib C API source file to ensure proper linking
2016-10-25 20:21:50 +02:00
Cedric Nugteren
f96fd372bc
Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes
2016-10-25 14:28:52 +02:00
Cedric Nugteren
fda39ffd86
Fixed the CMakeLists.txt for Visual Studio compilation
2016-10-23 14:34:46 +02:00
Cedric Nugteren
de0420dffa
Minor clean-up of the CMakeLists file
2016-10-22 16:38:42 +02:00
Cedric Nugteren
b0ff11acf0
Moved files around a bit; created a utilities subfolder
2016-10-22 15:36:48 +02:00
Cedric Nugteren
280698d076
Merge pull request #117 from intelfx/exceptions
Convert to use C++ exceptions internally
2016-10-22 15:05:12 +02:00
Ivan Shapovalov
b98af44fcf
treewide: use C++ exceptions properly
Since the codebase is designed around proper C++ idioms such as RAII, it
makes sense to only use C++ exceptions internally instead of mixing
exceptions and error codes. The exceptions are now caught at top level
to preserve compatibility with the existing error code-based API.
Note that we deliberately do not catch C++ runtime errors (such as
`std::bad_alloc`) nor logic errors (aka failed assertions) because no
actual handling can ever happen for such errors.
However, in the C interface we do catch _all_ exceptions (...) and
convert them into a wild-card error code.
2016-10-22 08:45:25 +03:00
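The layering described in the commit message above (exceptions internally, error codes at the API boundary, a catch-all only in the C interface) can be illustrated with a minimal sketch. This is not the project's actual code: the `sketch` namespace, `StatusCode` values, `LibraryError`, `ScaleInternal`, `Scale`, and `SketchScale` are all hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical status codes; the values are illustrative only.
enum class StatusCode : int { kSuccess = 0, kInvalidDimension = -1, kUnknownError = -2 };

// Hypothetical library-specific exception that carries a status code.
class LibraryError : public std::runtime_error {
 public:
  LibraryError(StatusCode status, const std::string& what)
      : std::runtime_error(what), status_(status) {}
  StatusCode status() const noexcept { return status_; }
 private:
  StatusCode status_;
};

// Internal routine: reports problems by throwing, never by returning a code.
void ScaleInternal(float* x, std::size_t capacity, std::size_t n, float alpha) {
  if (n == 0) {
    throw LibraryError(StatusCode::kInvalidDimension, "n must be positive");
  }
  if (n > capacity) {
    // Stand-in for a failed internal assertion (a logic error).
    throw std::logic_error("internal invariant violated: n exceeds buffer");
  }
  for (std::size_t i = 0; i < n; ++i) { x[i] *= alpha; }
}

// C++ API: catches only the library's own exception type and converts it to
// an error code. Logic errors and std::bad_alloc deliberately propagate.
StatusCode Scale(float* x, std::size_t capacity, std::size_t n, float alpha) {
  try {
    ScaleInternal(x, capacity, n, alpha);
    return StatusCode::kSuccess;
  } catch (const LibraryError& e) {
    return e.status();
  }
}

}  // namespace sketch

// C interface: no exception may cross this boundary, so everything else is
// caught and mapped to a wild-card error code.
extern "C" int SketchScale(float* x, std::size_t capacity, std::size_t n, float alpha) {
  try {
    return static_cast<int>(sketch::Scale(x, capacity, n, alpha));
  } catch (...) {
    return static_cast<int>(sketch::StatusCode::kUnknownError);
  }
}
```

In this sketch a C++ caller of `Scale` still sees a `std::logic_error` as an exception, while a C caller of `SketchScale` only ever receives an integer code.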
Cedric Nugteren
597974b40d
Merge pull request #118 from matze/add-pkg-config
Generate and install pkg-config description
2016-10-21 21:00:07 +02:00
Matthias Vogelgesang
3797d144cc
Generate and install pkg-config description
2016-10-21 09:38:25 +02:00
Cedric Nugteren
c8d0e41e84
Added the possibility to supply the env-variable CLBLAST_TEST_ARGUMENTS to specify options for the make alltest or ctest targets
2016-10-20 23:05:16 +02:00
Cedric Nugteren
53deed298f
Added documentation and minor refactoring for the recent support of static library compilation
2016-10-15 17:11:08 +02:00
Shehzan Mohammed
0d958bf3b3
Fixes for static lib compilation on Windows
2016-10-14 18:45:34 -04:00
Cedric Nugteren
c0482ace6c
Fixed a bug where clblas.h couldn't be found for the performance tests (clients)
2016-10-14 22:11:35 +02:00
Cedric Nugteren
3386ad49c4
Set proper flags for the verbose mode (debug flags)
2016-10-14 20:54:05 +02:00
Cedric Nugteren
99a620f9a1
Merge pull request #112 from shehzan10/static
Add option to build shared or static library
2016-10-14 10:06:44 +02:00
Shehzan Mohammed
56f07e42b1
Add option to build shared or static library
2016-10-13 12:03:44 -04:00
Cedric Nugteren
a9d35cf04c
Merge branch 'development' into gemm_direct
2016-10-01 13:45:08 +02:00
Anton Lokhmotov
c484bb26b6
Use cross-platform thread lib idiom instead of *nix-specific pthread.
2016-09-26 21:04:28 +00:00
Anton Lokhmotov
c20a5bb7ca
Link clBLAS together with pthread.
2016-09-26 10:30:18 +00:00
Cedric Nugteren
73d135c2ce
Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, NWGD and KWGD into one WGD parameter
2016-09-25 14:48:34 +02:00
Anton Lokhmotov
750f185ba9
Add path to ref library header when building tests.
2016-09-24 11:46:34 +00:00
Cedric Nugteren
4b94afda94
Updated to version 0.9.0
2016-09-13 19:20:39 +02:00
Cedric Nugteren
48ab0428cb
Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM
2016-09-13 19:08:49 +02:00
Ivan Shapovalov
9095537a6a
CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings
2016-09-13 16:12:30 +03:00
Cedric Nugteren
35623cd98d
Minor update regarding the previous CMake export/install target changes
2016-07-28 20:45:09 +02:00
Ivan Shapovalov
b5d7b58393
CMakeLists.txt: use target_include_directories()
2016-07-28 19:09:29 +03:00
Ivan Shapovalov
570cbcffa7
CMakeLists.txt: provide a find_package() config for dependent projects
2016-07-28 19:09:29 +03:00
Ivan Shapovalov
a1d80e7402
CMakeLists.txt: use ${clblast_SOURCE_DIR} instead of ${CMAKE_SOURCE_DIR}
2016-07-22 11:15:52 +03:00
Cedric Nugteren
27854070b4
Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen
2016-07-06 21:50:12 +02:00
CNugteren
2d665099ef
Fixed a linking issue with the tuners on Visual Studio
2016-07-04 19:46:14 +02:00
Cedric Nugteren
b330ab0866
Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library
2016-06-30 10:49:17 +02:00
Cedric Nugteren
577f0ee117
Updated to version 0.8.0
2016-06-28 21:32:00 +02:00
CNugteren
871b576c06
Made it possible to build the clients and tests on Windows using Visual Studio
2016-06-28 16:38:45 +02:00
Cedric Nugteren
ca386f9883
Added fp16 to the alltuners target
2016-06-27 11:46:33 +02:00
Cedric Nugteren
61203453aa
Renamed all C++ source files to .cpp to match the .hpp extension better
2016-06-19 13:55:49 +02:00
Cedric Nugteren
f726fbdc9f
Moved all headers into the source tree, changed headers to .hpp extension
2016-06-18 20:20:13 +02:00
Cedric Nugteren
bacb5d2bb2
Clean-up of the routine class, moved RunKernel to the routine/common file
2016-06-18 18:16:14 +02:00
Cedric Nugteren
52ccaf5b25
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing
2016-06-16 18:07:46 +02:00
Cedric Nugteren
b894611ad1
Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately
2016-06-14 18:17:58 +02:00
Cedric Nugteren
6d6b030053
Made the CPU BLAS library the default reference to test against in favor of clBLAS
2016-06-08 09:21:39 +02:00
Cedric Nugteren
7a7873d552
Fixed the RPATH settings for linking on OS X
2016-06-06 13:40:52 +02:00
Cedric Nugteren
983df6a8b4
Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test'
2016-05-31 20:53:55 +02:00
Cedric Nugteren
305bf16c4c
Separated the performance tests (clients) from the correctness tests in CMake
2016-05-30 16:38:26 +02:00
Cedric Nugteren
489c5d76cf
Merged in latest changes from 0.7.1 release
2016-05-18 21:32:56 +02:00
Cedric Nugteren
591e343ec9
Added an example of using the half-precision HAXPY routine
2016-05-15 20:18:34 +02:00
Cedric Nugteren
4b6bdd83a2
Added header with conversions from and to half-precision floating-point
2016-05-15 20:13:57 +02:00
Cedric Nugteren
c5730c8b43
Updated to version 0.7.0
2016-05-08 20:29:41 +02:00
Cedric Nugteren
2952390f27
Added an example to demonstrate the use of the ClearCache and FillCache functions
2016-04-29 23:33:36 +02:00
Cedric Nugteren
4f528b1730
Added sample C programs for the SASUM and DGEMV routines
2016-04-29 20:33:19 +02:00
Cedric Nugteren
82be8f211c
Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache
2016-04-27 16:02:13 +02:00
cnugteren
16a048f1ac
Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines
2016-04-20 22:12:51 -06:00