CLBlast/test
2018-11-07 19:45:07 +01:00
Name | Last commit | Date
correctness | Added groundwork for col2im algorithm plus first non-working version of kernel and test | 2018-10-23 20:52:25 +02:00
performance | Added groundwork for col2im algorithm plus first non-working version of kernel and test | 2018-10-23 20:52:25 +02:00
routines | Changed col2im to append to the existing im-buffer | 2018-11-07 19:45:07 +01:00
diagnostics.cpp | Moved timing function to a separate file | 2017-10-28 14:12:05 +02:00
test_utilities.cpp | Fixed the tests of OMATCOPY to include proper complex conjugation | 2018-07-31 21:44:28 +02:00
test_utilities.hpp | Fixed the tests of OMATCOPY to include proper complex conjugation | 2018-07-31 21:44:28 +02:00
wrapper_cblas.hpp | Added MKL as an alternative for CBLAS for correctness and performance comparisons | 2018-06-02 17:57:45 +02:00
wrapper_clblas.hpp | Removed half-precision support from the TRSM routine; too unstable | 2017-02-26 12:56:21 +01:00
wrapper_cublas.hpp | Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works | 2017-04-13 21:31:27 +02:00
wrapper_cuda.hpp | Fix an incompatibility with CUDA's FP16 definition | 2017-10-17 20:29:23 +02:00