CLBlast/src
Cedric Nugteren 3621639b63 Added device-name removal code to handle POCL naming convention 2018-07-13 21:20:27 +02:00
database Added tuning results for GeForce GTX 1070 Ti 2018-07-13 21:07:32 +02:00
kernels Some potential fixes for error -54 when launching TRSV and TRSM kernels 2018-05-31 20:09:49 +02:00
pyclblast Updated pyclblast to 1.1.0 and uploaded to PyPi 2018-03-30 10:38:36 +02:00
routines Fixes for Apple OpenCL CPU implementation which requires a LWGS of 1 when barriers are present 2018-06-01 20:59:44 +02:00
tuning Added an option to run the routine tuner for a single specific GEMM size 2018-05-19 17:42:11 +02:00
utilities Added device-name removal code to handle POCL naming convention 2018-07-13 21:20:27 +02:00
api_common.cpp Added a RetrieveParameters function to inspect tuning parameters 2018-01-11 20:32:06 +01:00
cache.cpp Now stores a shared_ptr to the Program class in the cache 2018-05-01 20:34:48 +02:00
cache.hpp Now stores a shared_ptr to the Program class in the cache 2018-05-01 20:34:48 +02:00
clblast.cpp Made GEMM rotation expectations kernel-specific 2018-04-13 22:27:11 +02:00
clblast_c.cpp Fixed some small issues regarding PR#253 2018-03-03 10:43:12 +01:00
clblast_cuda.cpp Fixes for the CUDA API 2018-04-20 21:50:36 +02:00
clblast_netlib_c.cpp Created the API and stubs for the HAD (hadamard-product) routines 2018-01-31 20:41:02 +01:00
clpp11.hpp Disabled calls to clReleaseProgram under Windows to avoid segfaults when the OpenCL driver unloads first 2018-06-28 20:35:18 +09:00
cupp11.hpp Fixes for CUDA version of CLBlast 2018-06-03 10:41:57 +02:00
cxpp11_common.hpp Various fixes to make the host code and sample compile with the CUDA API 2017-10-14 11:43:57 +02:00
kernel_preprocessor.cpp Fixed a warning under MSVC 2017-12-23 15:30:08 +01:00
kernel_preprocessor.hpp Implemented first simple pre-processor: defines parser and loop unrolling based on assumptions 2017-11-25 17:46:01 +01:00
routine.cpp Eliminate a temporary Program object 2018-07-06 12:58:20 +01:00
routine.hpp Now stores a shared_ptr to the Program class in the cache 2018-05-01 20:34:48 +02:00