database | Updated the tuning results for Intel IvyBridge M GT2 | 2018-07-31 20:49:41 +02:00
kernels | Disabled the use of staggered indices on AMD GPUs for the new GEMMK == 1 kernels to improve performance | 2018-07-28 14:36:33 +02:00
pyclblast | Updated pyclblast to 1.1.0 and uploaded to PyPi | 2018-03-30 10:38:36 +02:00
routines | Small refactoring of events in TRSV substitution routine | 2018-08-13 22:58:01 +02:00
tuning | Added print statements to indicate the 4 stages of GEMM tuning | 2018-07-28 16:08:22 +02:00
utilities | Merge pull request #297 from tyler-utah/master | 2018-07-23 19:43:03 +02:00
api_common.cpp | Added a RetrieveParameters function to inspect tuning parameters | 2018-01-11 20:32:06 +01:00
cache.cpp | Now stores a shared_ptr to the Program class in the cache | 2018-05-01 20:34:48 +02:00
cache.hpp | Now stores a shared_ptr to the Program class in the cache | 2018-05-01 20:34:48 +02:00
clblast.cpp | Made GEMM rotation expectations kernel-specific | 2018-04-13 22:27:11 +02:00
clblast_c.cpp | Fixed some small issues regarding PR#253 | 2018-03-03 10:43:12 +01:00
clblast_cuda.cpp | Fixes for the CUDA API | 2018-04-20 21:50:36 +02:00
clblast_netlib_c.cpp | Name change of setting to NETLIB_PERSISTENT_OPENCL | 2018-08-07 22:41:06 +02:00
clpp11.hpp | Fixed a wrong event issue causing error -57 | 2018-07-29 22:16:27 +02:00
cupp11.hpp | Applied feedback from Cedric from first pull request | 2018-07-14 19:50:47 -04:00
cxpp11_common.hpp | Various fixes to make the host code and sample compile with the CUDA API | 2017-10-14 11:43:57 +02:00
kernel_preprocessor.cpp | Fixed a warning under MSVC | 2017-12-23 15:30:08 +01:00
kernel_preprocessor.hpp | Implemented first simple pre-processor: defines parser and loop unrolling based on assumptions | 2017-11-25 17:46:01 +01:00
routine.cpp | Eliminate a temporary Program object | 2018-07-06 12:58:20 +01:00
routine.hpp | Now stores a shared_ptr to the Program class in the cache | 2018-05-01 20:34:48 +02:00