CLBlast/src
2018-01-11 20:32:06 +01:00
| Name | Last commit message | Date |
|---|---|---|
| database | Added a RetrieveParameters function to inspect tuning parameters | 2018-01-11 20:32:06 +01:00 |
| kernels | Implemented the in-direct version of the strided-batched GEMM kernel | 2018-01-08 21:07:01 +01:00 |
| routines | Implemented the in-direct version of the strided-batched GEMM kernel | 2018-01-08 21:07:01 +01:00 |
| tuning | Fixed a vendor naming bug in the tuners and in the database | 2018-01-06 17:02:58 +01:00 |
| utilities | Fixes for the CUDA backend of CLBlast | 2017-12-24 12:10:55 +01:00 |
| api_common.cpp | Added a RetrieveParameters function to inspect tuning parameters | 2018-01-11 20:32:06 +01:00 |
| cache.cpp | Made RemoveBySubset from the cache work with references to keys | 2017-02-12 11:58:20 +01:00 |
| cache.hpp | Added platform ID to the binary program cache to prevent issues with multi-platform systems | 2017-10-29 20:01:30 +01:00 |
| clblast.cpp | Added API and tests for new GemmStridedBatched routine | 2018-01-07 14:27:15 +01:00 |
| clblast_c.cpp | Added API and tests for new GemmStridedBatched routine | 2018-01-07 14:27:15 +01:00 |
| clblast_cuda.cpp | Added API and tests for new GemmStridedBatched routine | 2018-01-07 14:27:15 +01:00 |
| clblast_netlib_c.cpp | Added interface and stubs for the im2col routine | 2017-07-02 12:10:22 +02:00 |
| clpp11.hpp | Added optional temp-buffer argument to C++ interface of GEMM | 2017-12-30 18:45:06 +01:00 |
| cupp11.hpp | Added optional temp-buffer argument to C++ interface of GEMM | 2017-12-30 18:45:06 +01:00 |
| cxpp11_common.hpp | Various fixes to make the host code and sample compile with the CUDA API | 2017-10-14 11:43:57 +02:00 |
| kernel_preprocessor.cpp | Fixed a warning under MSVC | 2017-12-23 15:30:08 +01:00 |
| kernel_preprocessor.hpp | Implemented first simple pre-processor: defines parser and loop unrolling based on assumptions | 2017-11-25 17:46:01 +01:00 |
| routine.cpp | Added interface to compute the required temporary buffer size for GEMM | 2017-12-28 14:46:45 +01:00 |
| routine.hpp | Added interface to compute the required temporary buffer size for GEMM | 2017-12-28 14:46:45 +01:00 |