kodonell
|
173a7eb928
|
merged
|
2018-03-27 08:55:39 +13:00 |
|
kodonell
|
d16f2d1317
|
got the generator thing working
|
2018-03-27 08:45:54 +13:00 |
|
Cedric Nugteren
|
54bbc99273
|
Updated the documentation for the tuner API
|
2018-03-10 14:52:40 +01:00 |
|
Cedric Nugteren
|
3d2ef9331b
|
Fixed a few things for the new tuning API
|
2018-03-10 14:35:11 +01:00 |
|
Cedric Nugteren
|
bff64917bd
|
Fixed some small issues regarding PR#253
|
2018-03-03 10:43:12 +01:00 |
|
sivagnanamn
|
1433dc67f1
|
Added C API for getting GEMM temp buffer size
|
2018-03-03 03:00:17 +09:00 |
|
Cedric Nugteren
|
9699169cdf
|
Added API documentation for two missing C++ functions
|
2018-02-25 14:44:22 +01:00 |
|
Cedric Nugteren
|
e784df0230
|
Renamed the API documentation
|
2018-02-24 20:46:44 +01:00 |
|
Cedric Nugteren
|
ce5e2a1e00
|
Prepared PyCLBlast for release as a package on PyPi
|
2018-02-18 18:01:02 +01:00 |
|
Cedric Nugteren
|
eb85f6b514
|
First autogenerated version (clblastXswap only for now) of the pyclblast wrapper
|
2018-02-14 20:50:47 +01:00 |
|
Cedric Nugteren
|
ae66782eab
|
Fixed the XHAD documentation
|
2018-02-02 21:12:07 +01:00 |
|
Cedric Nugteren
|
ef5008f5e4
|
Created the API and stubs for the HAD (hadamard-product) routines
|
2018-01-31 20:41:02 +01:00 |
|
Cedric Nugteren
|
a500f537d8
|
Added a RetrieveParameters function to inspect tuning parameters
|
2018-01-11 20:32:06 +01:00 |
|
Cedric Nugteren
|
9fb2c61b25
|
Added API and tests for new GemmStridedBatched routine
|
2018-01-07 14:27:15 +01:00 |
|
Cedric Nugteren
|
ce069545d4
|
Added CUDA interface to get temporary-buffer size for GEMM routine
|
2018-01-06 10:05:28 +01:00 |
|
Cedric Nugteren
|
af14fff1e9
|
Updated the generator script to automatically generate the temp-buffer code
|
2018-01-04 19:31:57 +01:00 |
|
Cedric Nugteren
|
6d1e30e61f
|
Added interface to compute the required temporary buffer size for GEMM
|
2017-12-28 14:46:45 +01:00 |
|
Cedric Nugteren
|
b901809345
|
Added first (untested) version of a CUDA API
|
2017-10-11 23:16:57 +02:00 |
|
Cedric Nugteren
|
df3c9f4a8a
|
Moved non-routine-specific API functions and includes to separate files
|
2017-10-08 21:52:02 +02:00 |
|
Cedric Nugteren
|
84ec50e29d
|
Added interface and stubs for the im2col routine
|
2017-07-02 12:10:22 +02:00 |
|
Cedric Nugteren
|
615a7fdc81
|
Fixes some compilation issues related to the database structure change
|
2017-06-21 23:07:47 +02:00 |
|
Cedric Nugteren
|
f151e56daa
|
Added the IxAMIN routines: absolute minimum version of IxAMAX
|
2017-05-12 20:01:33 -07:00 |
|
Cedric Nugteren
|
409a5a2ad0
|
Fixed a namespace clash with CUDA FP16 for the half-datatype
|
2017-04-17 16:47:15 +02:00 |
|
Cedric Nugteren
|
22b3ea9256
|
Merge branch 'development' into cublas_reference
Conflicts:
scripts/generator/generator.py
|
2017-04-10 20:11:45 +02:00 |
|
Cedric Nugteren
|
2d45c37676
|
Removed const-vector-of-const-objects from the database class to conform to the C++11 standard
|
2017-04-10 07:40:27 +02:00 |
|
Cedric Nugteren
|
52dd7433ca
|
Completed the cuBLAS wrapper
|
2017-04-06 20:56:28 +02:00 |
|
Cedric Nugteren
|
674ff96fdf
|
Added a first version of a cuBLAS wrapper (WIP)
|
2017-04-05 21:27:25 +02:00 |
|
Cedric Nugteren
|
49e04c7fce
|
Added API and test infrastructure for the batched GEMM routine
|
2017-03-10 21:24:35 +01:00 |
|
Cedric Nugteren
|
b114ea49a9
|
Added first naive version of the batched AXPY routine
|
2017-03-05 15:06:14 +01:00 |
|
Cedric Nugteren
|
f9a520b3af
|
Prepared generator for batched routines; added batched AXPY routine interface
|
2017-03-05 10:38:38 +01:00 |
|
Cedric Nugteren
|
dde67ac79e
|
Minor fix to the generator script
|
2017-02-26 14:53:58 +01:00 |
|
Cedric Nugteren
|
ea6790665d
|
Merge branch 'development' into triangular_solvers
|
2017-02-26 14:51:45 +01:00 |
|
Cedric Nugteren
|
b7310036ed
|
Removed half-precision support from the TRSM routine; too unstable
|
2017-02-26 12:56:21 +01:00 |
|
Cedric Nugteren
|
fef11a208c
|
Added documentation for the OverrideParameters function
|
2017-02-18 11:02:57 +01:00 |
|
Cedric Nugteren
|
3d10690c83
|
Added missing documentation for the fill and clear cache functions
|
2017-02-18 10:32:32 +01:00 |
|
Cedric Nugteren
|
cda449a5c3
|
Added a C interface to the OverrideParameters function; added some in-line comments to the API
|
2017-02-16 21:14:48 +01:00 |
|
Cedric Nugteren
|
08bfb75a9d
|
Added input-sanity checks for the OverrideParameters function
|
2017-02-16 21:12:50 +01:00 |
|
Cedric Nugteren
|
cdb3bb7166
|
Added first version of the OverrideParameters function
|
2017-02-13 20:53:06 +01:00 |
|
Cedric Nugteren
|
c248f900c0
|
Merge branch 'development' into triangular_solvers
|
2017-02-05 22:18:59 +01:00 |
|
Ivan Shapovalov
|
1b8e816333
|
FillCache: perform compilation for each precision separately
Thus do not prevent filling cache for float if the device does not support
e.g. double.
|
2017-01-24 02:43:00 +03:00 |
|
Cedric Nugteren
|
a5fd2323b6
|
Added prototype for the TRSV routine
|
2017-01-20 11:30:32 +01:00 |
|
Cedric Nugteren
|
681a465b35
|
Prepared for the addition of the TRSM triangular solver kernel
|
2016-12-18 12:30:16 +01:00 |
|
Cedric Nugteren
|
792cc8359f
|
Fixed a vector-size related bug in the CLBlast Netlib API
|
2016-11-23 22:00:20 +01:00 |
|
Cedric Nugteren
|
26ca071480
|
Minor changes to ensure full compatibility with the Netlib CBLAS API
|
2016-11-22 08:41:52 +01:00 |
|
Cedric Nugteren
|
8ae8ab06a2
|
Renamed the include and source files of the Netlib CBLAS API
|
2016-10-25 20:33:10 +02:00 |
|
Cedric Nugteren
|
140121ef91
|
Removed the clblast namespace from the Netlib C API source file to ensure proper linking
|
2016-10-25 20:21:50 +02:00 |
|
Cedric Nugteren
|
926aca53a0
|
Made the Netlib CBLAS API use the same enums with prefixes as the regular C API of CLBlast
|
2016-10-25 19:45:57 +02:00 |
|
Cedric Nugteren
|
59183b7d79
|
Sets the proper sizes for the buffers for the Netlib CBLAS API
|
2016-10-25 19:21:49 +02:00 |
|
Cedric Nugteren
|
f96fd372bc
|
Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes
|
2016-10-25 14:28:52 +02:00 |
|
Cedric Nugteren
|
3b65eace0a
|
Merge branch 'development' into netlib_blas_api
Conflicts:
scripts/generator/generator.py
scripts/generator/generator/routine.py
|
2016-10-25 09:34:24 +02:00 |
|