1001 commits

Author SHA1 Message Date
Cedric Nugteren c151ab1325 Refactored the tuning architecture: less duplicate now; more defaults 2017-09-30 20:26:26 +02:00
Cedric Nugteren ef082bba0d Fixed a minor appveyor artifact issue 2017-09-30 17:33:37 +02:00
Cedric Nugteren f4c4674cf6 Updated to version 1.1.0 2017-09-30 17:19:17 +02:00
Cedric Nugteren 2949e156f5 Added notes for Android compilation of CLBlast 2017-09-26 21:23:53 +02:00
Cedric Nugteren 00b5771477 Added Android header for compilation with gnustl STL 2017-09-26 21:20:01 +02:00
Cedric Nugteren 21af690472 Added missing headers 2017-09-26 21:17:55 +02:00
Cedric Nugteren ed980a1df1 Updated database override function to work with the new database storage format 2017-09-24 15:44:14 +02:00
Cedric Nugteren 255f09843c Made program and binary databases dependent on the routine parameters on top of the name 2017-09-23 20:40:38 +02:00
Cedric Nugteren 0d8313708c Merge branch 'device_name_slow_on_nvidia_gpu' 2017-09-23 18:12:13 +02:00
Cedric Nugteren 2df9f21ab8 Added extra benchmarks to verify new database caching keys performance 2017-09-23 18:06:43 +02:00
Cedric Nugteren 890281f3e8 Made database-caching no longer dependent on device name but on device/platform IDs 2017-09-23 17:50:44 +02:00
Cedric Nugteren 0dd2ca9283 Merge pull request #192 from CNugteren/diagnostics_helper: Diagnostics helper 2017-09-23 11:43:19 +02:00
Cedric Nugteren 65c492edf6 Added OpenCL properties printing to the diagnostics helper 2017-09-22 21:35:32 +02:00
Cedric Nugteren 2ef6578961 Added first version of a small CLBlast diagnostics helper 2017-09-19 21:43:35 +02:00
Cedric Nugteren 44b59ec0cb Merge branch 'msvc2013_fixes' 2017-09-19 19:54:43 +02:00
Cedric Nugteren ae1eeb4d1f Fixed type conversion warnings under MSVC 2013 2017-09-19 19:44:34 +02:00
Cedric Nugteren 1d2ee29cb9 Fixed compilation issues of the database for MSVC 2013 2017-09-19 19:44:05 +02:00
Cedric Nugteren a23cd8d13a Updated README with proper AMD device names; fixed device look-up for names of length 50+ 2017-09-16 21:26:38 +02:00
Cedric Nugteren 0802e3d84c Added tuning results for Intel Core i7 6770HQ 2017-09-16 21:19:06 +02:00
Cedric Nugteren 7d0ef8e10d Merge pull request #191 from CNugteren/database_improvements: Database improvements 2017-09-16 20:37:09 +02:00
Cedric Nugteren bcf39eb79a Fixed a compilation error and warning under MacOS 2017-09-16 18:34:11 +02:00
Cedric Nugteren 163474e171 Fixed an issue with the NVIDIA compute capability not being retrieved properly 2017-09-16 18:25:23 +02:00
Cedric Nugteren 4e317f5e85 Improved compilation time of the tuner database 2017-09-16 18:02:37 +02:00
Cedric Nugteren c21878ecce Added a guard against missing AMD and NVIDIA extensions 2017-09-14 21:58:08 +02:00
Cedric Nugteren 0d13d814c2 Added architecture layer in the tuning database for better performance on unseen devices 2017-09-14 21:27:33 +02:00
Cedric Nugteren 14a61d2425 Added database compress and de-compress functions 2017-09-12 22:25:52 +02:00
Cedric Nugteren ebe10d5118 Database now works with new format of clblast_[property] 2017-09-11 20:40:37 +02:00
Cedric Nugteren 76382ff6c1 Added the new vendor-architecture-name hierarchy to the tuners as well 2017-09-10 16:34:54 +02:00
Cedric Nugteren 91ea7fcde2 Introduced the notion of a device-architecture for the database and added device and architecture name mappings 2017-09-08 21:09:05 +02:00
Cedric Nugteren 20da5e33a8 Split the database files over multiple directories and files; first step towards separate compilation 2017-09-06 21:50:42 +02:00
Cedric Nugteren bb947890de Merge branch 'im2col_bugfix' 2017-09-05 20:08:00 +02:00
Cedric Nugteren 8905da259d Fixed a modulo and division issue manifesting on Apple OpenCL for im2col 2017-09-05 18:49:23 +02:00
Cedric Nugteren 28462aa050 Removed an assumption that the 'default' tuning parameters have to be stored last; this is no longer needed 2017-09-04 17:39:57 +02:00
Cedric Nugteren 297159d5b9 Fixed a bug in im2col: process only valid channel IDs 2017-08-31 21:58:12 +02:00
Cedric Nugteren 6194d43efb Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d 2017-08-31 20:34:10 +02:00
Cedric Nugteren 54e160cd88 Fixed some things in the tuner: bugs, style, and defaults to random search 2017-08-31 20:28:01 +02:00
Cedric Nugteren 6e95752054 Merge pull request #184 from CNugteren/im_to_col: im2col 2017-08-30 19:17:17 +02:00
Cedric Nugteren 161fd8514d Merge branch 'master' into im_to_col 2017-08-24 21:15:14 +02:00
Cedric Nugteren 4d9d03ba51 Completed im2col implementation 2017-08-24 21:11:12 +02:00
Cedric Nugteren a8c26594d9 Made the im2col client properly handle the arguments 2017-08-23 19:54:09 +02:00
Cedric Nugteren da28cc5e93 Minor updates after merging in the PSO addition to the tuners 2017-08-21 20:14:02 +02:00
Cedric Nugteren e5eb6b1d3a Merge pull request #173 from mcian/PSO_params: Add PSO parameters support and search strategy selection from command… 2017-08-21 20:06:29 +02:00
mcian dfd332524a Remove multistrategy and related functions 2017-08-21 14:09:11 +02:00
Cedric Nugteren 803ca781f9 First version of im2col kernel, unoptimized but working 2017-08-19 18:25:13 +02:00
Cedric Nugteren 132e62892d Implemented proper im2col reference function and completed tests 2017-08-19 16:55:09 +02:00
Cedric Nugteren 777681dcbd Merge branch 'master' into im_to_col 2017-08-12 20:50:00 +02:00
Cedric Nugteren d67fd6604b Merge pull request #182 from CNugteren/compilation_improvements: Compilation improvements 2017-08-12 17:17:10 +02:00
Cedric Nugteren d30c459c5f Fixed .hpp -> .h typo in CMakeLists 2017-08-12 16:11:23 +02:00
Cedric Nugteren f6b6d7ef4b Properly set the common test utilities in the CMake files 2017-08-12 16:07:28 +02:00
Cedric Nugteren 0a63621579 Moved functions from the header to the .cpp file to prevent compiling the same code multiple times 2017-08-12 15:59:14 +02:00