Commit graph

14 commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Cedric Nugteren | 6f4b34f813 | Added tuning parameters for various devices using the new database script | 2016-02-07 16:41:09 +01:00 |
| Cedric Nugteren | 276e772a2c | Added first auto-generated database headers from the Python database; only K40 and Iris supported now | 2016-01-30 11:43:21 +01:00 |
| CNugteren | a2e726d3bd | Added xDOT/xDOTU/xDOTC dot-product routines | 2015-09-14 16:57:00 +02:00 |
| CNugteren | f7199b831f | Now using the new Claduc C++11 OpenCL header | 2015-07-27 07:18:06 +02:00 |
| CNugteren | dd8471ba92 | Set the correct name for AMD OpenCL devices | 2015-07-22 19:25:06 +02:00 |
| CNugteren | 3a6bdeb79a | Updated GEMM tuning results for Tahiti | 2015-07-22 07:31:39 +02:00 |
| CNugteren | 4dcecfe934 | Added workgroup shuffle option to transpose kernel for AMD GPUs | 2015-07-22 07:31:16 +02:00 |
| CNugteren | 250f8ab295 | Fixed complex performance on Intel Iris | 2015-07-19 13:39:13 +02:00 |
| CNugteren | ce703a2f5a | Added tuning for DGEMV on Iris and SGEMV on K40m | 2015-06-15 08:41:13 +02:00 |
| CNugteren | 294a3e3d41 | Split the three variations of the GEMV kernel for maximal tuning freedom | 2015-06-14 11:15:53 +02:00 |
| CNugteren | 4b3e3dcfe0 | Added a fast GEMV kernel with vector loads, no tail, and fewer if-statements | 2015-06-13 20:46:01 +02:00 |
| CNugteren | 9b66883e9c | Improved GEMV kernel with local memory and a tunable WPT | 2015-06-13 14:10:07 +02:00 |
| CNugteren | e522d1a74e | Added initial version of GEMV including tester and performance client | 2015-06-13 11:01:20 +02:00 |
| CNugteren | bc5a341dfe | Initial commit of preview version | 2015-05-30 12:30:43 +02:00 |