CLBlast/src
Cedric Nugteren 6e2ab6ee96
Add tuning results for 5 devices (#526)
2024-02-08 20:33:33 +01:00
database Add tuning results for 5 devices (#526) 2024-02-08 20:33:33 +01:00
kernels AMAX/AMIN integer testing and bug fixes (#457) 2023-05-07 20:02:52 +02:00
pyclblast Python module multi-platform setup (#519) 2024-01-21 10:58:38 +01:00
routines TBMV/TPMV/TRSV: Use the minimum x buffer size for copying to a temp buffer (#461) 2023-05-10 12:48:25 +02:00
tuning Fix issue with printing out-of-bounds local/global sizes for level 1 tuners 2021-05-22 20:31:12 +02:00
utilities AMAX/AMIN integer testing and bug fixes (#457) 2023-05-07 20:02:52 +02:00
api_common.cpp Made tuning API more flexible: disregards any extra parameter values 2018-10-13 17:47:29 +02:00
cache.cpp Fix a multithreading bug related to storing objects in the cache (#495) 2023-07-08 20:08:00 +02:00
cache.hpp Now stores a shared_ptr to the Program class in the cache 2018-05-01 20:34:48 +02:00
clblast.cpp Add kernel_mode option to im2col, col2im, and convgemm functions 2018-11-12 10:12:07 +09:00
clblast_c.cpp Add kernel_mode option to im2col, col2im, and convgemm functions 2018-11-12 10:12:07 +09:00
clblast_cuda.cpp Add kernel_mode option to im2col, col2im, and convgemm functions 2018-11-12 10:12:07 +09:00
clblast_netlib_c.cpp Add kernel_mode option to im2col, col2im, and convgemm functions 2018-11-12 10:12:07 +09:00
clpp11.hpp Fixes an issue under Android when the driver was already unloaded (#462) 2023-05-10 17:10:17 +02:00
cupp11.hpp Fix API inconsistency in cupp11.hpp 2022-05-23 12:45:22 +02:00
cxpp11_common.hpp Various fixes to make the host code and sample compile with the CUDA API 2017-10-14 11:43:57 +02:00
kernel_preprocessor.cpp Fix preprocessor and extend test coverage (#498) 2023-08-07 20:32:30 +02:00
kernel_preprocessor.hpp Implemented first simple pre-processor: defines parser and loop unrolling based on assumptions 2017-11-25 17:46:01 +01:00
routine.cpp Added a function to set the OpenCL kernel standard, either 1.1 or 1.2 2019-05-11 20:39:00 +02:00
routine.hpp Now stores a shared_ptr to the Program class in the cache 2018-05-01 20:34:48 +02:00