Commit graph

863 commits

Author SHA1 Message Date
Cedric Nugteren 19504ed609 Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings 2017-06-25 20:59:22 +02:00
Cedric Nugteren b8df03e5bc Added CLBlast paper and presentation references in README 2017-06-25 20:45:14 +02:00
Cedric Nugteren 1a8ed48a35 Fixed some Clang and MSVC warnings 2017-06-25 11:50:36 +02:00
Cedric Nugteren 7eab65b699 Merge branch 'database_compilation_speed' 2017-06-25 10:16:52 +02:00
Cedric Nugteren 615a7fdc81 Fixes some compilation issues related to the database structure change 2017-06-21 23:07:47 +02:00
Cedric Nugteren e44feb8576 Changed the structure of the database to reduce compilation time and save memory 2017-06-20 21:19:26 +02:00
Cedric Nugteren 48f2682eb7 Added tuning results for the Core i7-920 CPU 2017-06-18 20:53:59 +02:00
Cedric Nugteren 3070b502b5 Fixed an overflow bug on 32-bit systems when choosing a GEMM kernel 2017-06-18 20:51:11 +02:00
Cedric Nugteren 33ed1e5a06 Added tuning results for GeForce GT 650M (thanks to bzcheeseman) 2017-06-01 22:52:08 +02:00
Cedric Nugteren f57e209aab Merge pull request #158 from CNugteren/msvc_compilation_fixes: MSVC compilation fixes 2017-05-27 17:53:30 +02:00
Cedric Nugteren 4e04008729 Update to AppVeyor because of changed Khronos repository (9) 2017-05-27 17:39:36 +02:00
Cedric Nugteren 7827cfbe4a Update to AppVeyor because of changed Khronos repository (8) 2017-05-27 17:33:47 +02:00
Cedric Nugteren 9ae6f174d9 Update to AppVeyor because of changed Khronos repository (7) 2017-05-27 17:30:30 +02:00
Cedric Nugteren bb37bd0814 Update to AppVeyor because of changed Khronos repository (6) 2017-05-27 17:17:10 +02:00
Cedric Nugteren 53d739129e Update to AppVeyor because of changed Khronos repository (5) 2017-05-27 17:11:22 +02:00
Cedric Nugteren f7a822110c Update to AppVeyor because of changed Khronos repository (4) 2017-05-27 17:06:09 +02:00
Cedric Nugteren 3bca9f85d2 Update to AppVeyor because of changed Khronos repository (3) 2017-05-27 17:01:11 +02:00
Cedric Nugteren 70188686f2 Merge pull request #157 from kpot/improved_caching: Fixes inability to run GEMM on multiple identical GPUs (issue #155) 2017-05-27 09:47:25 +02:00
Kirill Mavreshko 64ba590279 Fixed comment describing the order of program cache fields 2017-05-27 10:30:09 +05:00
Cedric Nugteren 01de4b5413 Update to AppVeyor because of changed Khronos repository (2) 2017-05-26 22:20:04 +02:00
Cedric Nugteren e8b6f01e04 Update to AppVeyor because of changed Khronos repository 2017-05-26 22:12:02 +02:00
Cedric Nugteren f7a16d427c Fixed a compilation issue under MSVC 2013 2017-05-26 22:10:56 +02:00
Kirill Mavreshko 628e1e8cce Fixes inability to run GEMM on multiple identical GPUs (issue #155) 2017-05-26 15:04:19 +05:00
Cedric Nugteren 9c703a6021 Merge pull request #156 from ctuning/master: changing "wb" to "w" when saving json file (text mode) 2017-05-24 20:18:41 +02:00
Grigori Fursin 35e2e6c3a4 changing "wb" to "w" when saving json file (text mode) - compatibility for Python 3 2017-05-24 15:08:34 +02:00
Cedric Nugteren 953a5a9c22 Fixed a minor compilation issue of a sample with GCC 4.8 2017-05-15 22:14:17 +02:00
Cedric Nugteren 8400ee3a09 Fixed a TRSM issue caused by incorrect block size calculation 2017-05-15 22:04:55 +02:00
Cedric Nugteren 512b83dbad Fixed a missing synchronization barrier in the invert kernel; fixes TRSM tests 2017-05-14 20:27:35 +02:00
Cedric Nugteren f151e56daa Added the IxAMIN routines: absolute minimum version of IxAMAX 2017-05-12 20:01:33 -07:00
Cedric Nugteren 86e8df60f1 Fixed a bug in the TRSM routine; tests now pass 2017-05-12 17:43:56 -07:00
Cedric Nugteren 81d9ed3946 Removed the included performance reports; README now redirects to the new external website 2017-05-12 13:18:10 -07:00
Cedric Nugteren 71933c3411 Added tuning results for the AMD Radeon Fiji GPU 2017-05-11 22:53:52 -07:00
Cedric Nugteren d67455fdb8 Fixes the build-status table in the README 2017-05-11 22:22:10 -07:00
Cedric Nugteren 93c8db7fe7 Bug-fix in the half-precision test of the amax routine 2017-05-11 22:19:15 -07:00
Cedric Nugteren 1df28a15fc Re-added random tuning for GEMM after accidental removal 2017-05-11 22:12:38 -07:00
Cedric Nugteren 97955fc221 Minor naming fixes to the benchmark script 2017-05-11 22:12:16 -07:00
Cedric Nugteren 81f598eceb Merge branch 'master_is_neww_devel_branch' 2017-05-11 21:41:18 -07:00
Cedric Nugteren b0f3659121 The master branch is now the main 'development' branch 2017-05-03 19:49:15 +02:00
Cedric Nugteren 606f2871dd Merge pull request #150 from CNugteren/development: Update to version 0.11.0 2017-05-02 22:39:50 +02:00
Cedric Nugteren e9d2a2f54c Updated to version 0.11.0 2017-05-02 20:29:59 +02:00
Cedric Nugteren c9f39ed13a Merge pull request #148 from CNugteren/benchmarking: Various updates related to benchmarking 2017-04-23 18:29:59 +02:00
Cedric Nugteren 67d4bbff66 Added an option to the database script to remove tuning results from the database 2017-04-23 17:59:16 +02:00
Cedric Nugteren 1c33af6eab Re-added Titan X (Pascal) tuning results based on more averaging when tuning 2017-04-23 17:58:56 +02:00
Cedric Nugteren 049d0fc95a Fixed a compiler warning message 2017-04-23 10:45:08 +02:00
Cedric Nugteren 3eea8dc998 Increased the default number of runs for the tuner from 2 up to 10 for fast kernels 2017-04-22 13:56:07 +02:00
Cedric Nugteren 192199c9cb Fixed the direct vs indirect setting for NVIDIA GPUs 2017-04-22 13:43:27 +02:00
Cedric Nugteren e41d204856 Increased the default number of runs for GEMV tuning; updated GEMV tuning results for Iris Pro 2017-04-21 22:12:20 +02:00
Cedric Nugteren 957aaae6ca Merge branch 'development' into benchmarking 2017-04-21 21:59:48 +02:00
Cedric Nugteren cc9ad7b33b Removed the words SUMMARY from the title of the benchmark script when benchmarking the summary 2017-04-21 21:34:44 +02:00
Cedric Nugteren 4d34083039 Updated the settings for the batched benchmarks 2017-04-20 22:19:29 +02:00