CLBlast/src
2016-05-15 16:10:56 +02:00
kernels Added support for staggered/shuffled offsets for GEMM to improve performance for large power-of-2 kernels on AMD GPUs 2016-05-15 14:04:34 +02:00
routines Fixed a bug in the xGEMM routine related to the event incorrectly set 2016-05-15 16:10:56 +02:00
tuning Made the default xDOT tuning size smaller 2016-05-01 14:39:44 +02:00
cache.cc Added a program cache (per-context) next to the per-device binary cache 2016-05-01 12:56:08 +02:00
clblast.cc Changed the index buffer of IxAMAX routines to unsigned int for proper buffer-size checking 2016-05-01 14:03:37 +02:00
clblast_c.cc Added non-absolute minimum counterpart IxMIN of the BLAS routine IxAMAX 2016-04-30 09:49:39 +02:00
database.cc Added XGER routine, kernel, and tuner 2016-02-20 12:40:01 +01:00
routine.cc Added support for staggered/shuffled offsets for GEMM to improve performance for large power-of-2 kernels on AMD GPUs 2016-05-15 14:04:34 +02:00
utilities.cc Set a proper default precision for the CLBlast clients 2016-02-20 14:41:53 +01:00