Cedric Nugteren
e3ce21bb93
Bump to v1.6.1 (#496)
2023-07-09 11:24:24 +02:00
Cedric Nugteren
b0b302889c
Update to version 1.6.0 (#475)
2023-05-21 20:51:05 +02:00
Cedric Nugteren
221121b840
Add GitHub Actions CI (#464)
...
This replaces the old Travis CI builds with GitHub Actions that test on both Ubuntu and macOS, with both Clang and GCC. The builds on macOS also run the tests and some other programs; on Ubuntu, OpenCL is not working at the moment. Because these tests use new/different compilers, I fixed a few warnings and errors along the way.
2023-05-14 11:25:15 +02:00
Cedric Nugteren
0de212a56b
Update to version 1.5.3
2022-09-22 22:07:33 +02:00
Gard Spreemann
3d3492646c
Correct capitalization typo
...
The CLBlastConfig.cmake file was installed to a directory named
CLBLast (notice second capital l), which can cause issues for CMake's
search path when looking for CLBlast on the system.
This commit also fixes other occurrences of the wrong capitalization,
all of it purely cosmetic (i.e. in comments).
2021-04-30 10:27:22 +02:00
Cedric Nugteren
396ac0278a
Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering
2020-05-12 14:43:25 +02:00
Koichi Akabe
032e3b0cc0
Add kernel_mode option to im2col, col2im, and convgemm functions
2018-11-12 10:12:07 +09:00
Cedric Nugteren
d45911b61d
Added groundwork for col2im algorithm plus first non-working version of kernel and test
2018-10-23 20:52:25 +02:00
Cedric Nugteren
2dd539f911
Removed complex numbers support for CONVGEMM
2018-07-29 10:37:14 +02:00
Cedric Nugteren
2776d76176
Added interface of batched convolution as GEMM
2018-05-05 14:06:33 +02:00
Cedric Nugteren
e7dccfa3cc
Fixed an issue for DLL linking under Windows
2018-03-10 14:57:36 +01:00
Cedric Nugteren
3d2ef9331b
Fixed a few things for the new tuning API
2018-03-10 14:35:11 +01:00
Cedric Nugteren
0bdc51e47c
Completed the API for all tuneable kernels
2018-03-10 10:54:44 +01:00
Cedric Nugteren
6397e61746
Added several more tuner API functions
2018-03-09 21:40:22 +01:00
Cedric Nugteren
0e1a152023
First version of the tuning API, added interface for copy-kernel, added sample
2018-03-06 20:52:12 +01:00
Cedric Nugteren
bff64917bd
Fixed some small issues regarding PR#253
2018-03-03 10:43:12 +01:00
sivagnanamn
1433dc67f1
Added C API for getting GEMM temp buffer size
2018-03-03 03:00:17 +09:00
Cedric Nugteren
ef5008f5e4
Created the API and stubs for the HAD (hadamard-product) routines
2018-01-31 20:41:02 +01:00
Cedric Nugteren
a500f537d8
Added a RetrieveParameters function to inspect tuning parameters
2018-01-11 20:32:06 +01:00
Cedric Nugteren
9fb2c61b25
Added API and tests for new GemmStridedBatched routine
2018-01-07 14:27:15 +01:00
Cedric Nugteren
ad197da08d
Fixed the CUDA interface: replaced nullptr with 0
2018-01-06 13:38:44 +01:00
Cedric Nugteren
ce069545d4
Added CUDA interface to get temporary-buffer size for GEMM routine
2018-01-06 10:05:28 +01:00
Cedric Nugteren
44431daecc
Added a CUDA version of the GEMM temp-buffer optional argument
2018-01-04 19:33:51 +01:00
Cedric Nugteren
ad1227c4f2
Added optional temp-buffer argument to C++ interface of GEMM
2017-12-30 18:45:06 +01:00
Cedric Nugteren
6d1e30e61f
Added interface to compute the required temporary buffer size for GEMM
2017-12-28 14:46:45 +01:00
Cedric Nugteren
cc5b475425
CUDA API now takes a context and device instead of a stream
2017-10-12 12:20:43 +02:00
Cedric Nugteren
b901809345
Added first (untested) version of a CUDA API
2017-10-11 23:16:57 +02:00
Cedric Nugteren
e8f1de0265
Made the half-precision header OpenCL-independent
2017-10-09 18:30:19 +02:00
Cedric Nugteren
84ec50e29d
Added interface and stubs for the im2col routine
2017-07-02 12:10:22 +02:00
Cedric Nugteren
f151e56daa
Added the IxAMIN routines: absolute minimum version of IxAMAX
2017-05-12 20:01:33 -07:00
Cedric Nugteren
409a5a2ad0
Fixed a namespace clash with CUDA FP16 for the half-datatype
2017-04-17 16:47:15 +02:00
Cedric Nugteren
fb6c78ea07
Added a special override database for the Apple CPU implementation on OS X: this makes the test work; it does not focus on good performance
2017-04-07 07:37:30 +02:00
Cedric Nugteren
49e04c7fce
Added API and test infrastructure for the batched GEMM routine
2017-03-10 21:24:35 +01:00
Cedric Nugteren
fa0a9c689f
Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes
2017-03-08 20:10:20 +01:00
Cedric Nugteren
b114ea49a9
Added first naive version of the batched AXPY routine
2017-03-05 15:06:14 +01:00
Cedric Nugteren
f9a520b3af
Prepared generator for batched routines; added batched AXPY routine interface
2017-03-05 10:38:38 +01:00
Cedric Nugteren
ea6790665d
Merge branch 'development' into triangular_solvers
2017-02-26 14:51:45 +01:00
Cedric Nugteren
b7310036ed
Removed half-precision support from the TRSM routine; too unstable
2017-02-26 12:56:21 +01:00
Cedric Nugteren
d6538dfc25
Fixed the naming of the C API of OverrideParameters and fixed the description
2017-02-18 10:59:38 +01:00
Cedric Nugteren
cda449a5c3
Added a C interface to the OverrideParameters function; added some in-line comments to the API
2017-02-16 21:14:48 +01:00
Cedric Nugteren
08bfb75a9d
Added input-sanity checks for the OverrideParameters function
2017-02-16 21:12:50 +01:00
Cedric Nugteren
cdb3bb7166
Added first version of the OverrideParameters function
2017-02-13 20:53:06 +01:00
Cedric Nugteren
26ca071480
Minor changes to ensure full compatibility with the Netlib CBLAS API
2016-11-22 08:41:52 +01:00
Cedric Nugteren
eefe0df435
Made functions with scalar-buffers as output properly return values
2016-11-20 21:36:57 +01:00
Cedric Nugteren
8ae8ab06a2
Renamed the include and source files of the Netlib CBLAS API
2016-10-25 20:33:10 +02:00
Cedric Nugteren
729862e873
Fixed some issues preventing the Netlib CBLAS API from linking correctly
2016-10-25 19:56:42 +02:00
Cedric Nugteren
926aca53a0
Made the Netlib CBLAS API use the same enums with prefixes as the regular C API of CLBlast
2016-10-25 19:45:57 +02:00
Cedric Nugteren
f96fd372bc
Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes
2016-10-25 14:28:52 +02:00
Cedric Nugteren
3b65eace0a
Merge branch 'development' into netlib_blas_api
...
Conflicts:
scripts/generator/generator.py
scripts/generator/generator/routine.py
2016-10-25 09:34:24 +02:00
Cedric Nugteren
a670c4c4bf
All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects
2016-10-22 16:14:56 +02:00