Commit graph

65 commits

Author SHA1 Message Date
Cedric Nugteren d929525039 Added support for the convgemm tuner in the tuner database 2018-12-31 18:49:12 +01:00
Cedric Nugteren f14e6f87d2 Updated tuning results for the Skylake ULT GT2 GPU with the new kernel 2018-04-15 11:45:45 +02:00
Cedric Nugteren f6a48f05ed Made it possible to add tuning parameters to the database using the script 2018-04-10 21:24:36 +02:00
Cedric Nugteren 3fbbb81137 Fixed a bug in the compression part of the database script 2018-04-10 21:18:11 +02:00
Cedric Nugteren 77ba11f686 Extended the maximum number of tuning parameters from 14 to 16 2018-04-08 18:12:54 +02:00
Cedric Nugteren cf7965dc68 Fixed a python3 import error issue with the database script 2018-04-07 17:40:43 +02:00
Cedric Nugteren 1e738db6dd Split the database into multiple small compilation units 2017-12-27 12:04:22 +01:00
Cedric Nugteren 0ee81e27b9 Added tuning results for Apple AMD Radeon Pro 580 2017-12-20 19:59:31 +01:00
Cedric Nugteren c680666250 Added try-except to database script parser to skip invalid files 2017-12-20 19:14:04 +01:00
Cedric Nugteren 606990af6f Made the database script properly handle multiple entries for a single device 2017-11-20 21:38:23 +01:00
Cedric Nugteren defad3d1a2 Minor fix to the database script 2017-11-19 18:19:21 +01:00
Cedric Nugteren a3a8b44f59 Some fixed for the new auto-tuner to be compatible with the Python scripts 2017-11-19 16:31:08 +01:00
Cedric Nugteren 33ac2b0175 Improved the way the database defaults are computed 2017-11-06 21:59:45 +01:00
Cedric Nugteren 9b0a435fb0 Integrated the GEMM routine tuner for kernel selection; added first tuning results 2017-11-02 21:47:14 +01:00
Cedric Nugteren 73272ab97d Fixed a bug in database compression/decompression 2017-11-02 21:19:18 +01:00
Cedric Nugteren 4e317f5e85 Improved compilation time of the tuner database 2017-09-16 18:02:37 +02:00
Cedric Nugteren 0d13d814c2 Added architecture layer in the tuning database for better performance on unseen devices 2017-09-14 21:27:33 +02:00
Cedric Nugteren 14a61d2425 Added database compress and de-compress functions 2017-09-12 22:25:52 +02:00
Cedric Nugteren ebe10d5118 Database now works with new format of clblast_[property] 2017-09-11 20:40:37 +02:00
Cedric Nugteren 20da5e33a8 Split the database files over multiple directories and files; first step towards separate compilation 2017-09-06 21:50:42 +02:00
Cedric Nugteren 615a7fdc81 Fixes some compilation issues related to the database structure change 2017-06-21 23:07:47 +02:00
Cedric Nugteren e44feb8576 Changed the structure of the database to reduce compilation time and save memory 2017-06-20 21:19:26 +02:00
Grigori Fursin 35e2e6c3a4 changing "wb" to "w" when saving json file (text mode) - compatibility for Python 3 2017-05-24 15:08:34 +02:00
Cedric Nugteren 67d4bbff66 Added an option to the database script to remove tuning results from the database 2017-04-23 17:59:16 +02:00
Cedric Nugteren 1c33af6eab Re-added Titan X (Pascal) tuning results based on more averaging when tuning 2017-04-23 17:58:56 +02:00
Cedric Nugteren 3ec14df60e Added proper handling of mismatched arguments in the database script 2017-04-17 15:00:45 +02:00
Cedric Nugteren 32b850b12b Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU 2017-01-03 20:30:56 +01:00
Cedric Nugteren 080e1be684 Improved the default parameters for cases with non-common parameters across all devices 2016-11-26 16:38:17 +01:00
Cedric Nugteren 0f9311d46a Fixed an issue with a growing database: the database is now a global variable in a namespace and its container uses const-pointers to the actual data 2016-10-14 20:56:32 +02:00
Cedric Nugteren 39afc9543b Changed the storage location of the database to a separate Github repository 2016-10-10 19:10:12 +02:00
Cedric Nugteren aa3dffe356 Added XgemvFastRot and Xgemm 16-bit tuning results: just defaults which are now automatically taken from 32-bit if there are no entries at all 2016-09-12 20:13:38 +02:00
Cedric Nugteren b5a67f86ec Complete re-write of the database script. Changed Pandas for the much faster and convienient plain JSON/dict data-type 2016-09-11 21:29:28 +02:00
Cedric Nugteren e21f32bc99 Updated database based on exhaustive tuning results for GEMM for the R9 M370X GPU 2016-09-10 14:00:43 +02:00
Cedric Nugteren 3daba70997 Updated the database script to remove duplicate entries: keeps only the best-performing cases for a specific parameters combination 2016-09-10 11:12:09 +02:00
Cedric Nugteren 521bf6cdfc Added tuning results for Intel Broadwell 5500 GT2 GPU 2016-09-03 16:43:23 +02:00
Cedric Nugteren 19574b2519 Updated tuning results for Haswell GT2 Mobile GPU; fixed database script to handle duplicate entries of different runs 2016-09-03 12:45:11 +02:00
Cedric Nugteren 0c0f0ac7f9 Also changed the default-default for unknown device types to use the same method as for known device groups 2016-08-21 20:35:20 +02:00
Cedric Nugteren 00979faab4 Updated the changelog; refactored the database-get-bests code a bit 2016-08-21 20:16:06 +02:00
Cedric Nugteren 7d5631b7e4 Updated the database script to calculate the relative best performance of tuning results common for a device/vendor type 2016-08-15 21:01:07 +02:00
Cedric Nugteren 7da6492b36 Improved the speed of the new common-best defaults method for the database generation 2016-08-09 21:06:04 +02:00
Cedric Nugteren 3f5401d4c8 Added a first version of the database's common-best default calculation 2016-08-07 16:25:38 +02:00
Cedric Nugteren 2582f0290a Moved the XgemvFast and XgemvFastRot tuning database into a separate file 2016-07-25 22:43:49 +02:00
Cedric Nugteren 622682ffe3 Refactored the Python database script: separated functionality in modules, now complies to the PEP8 style, added proper command-line argument parsing, and cleaned-up 2016-07-24 16:41:01 +02:00
Cedric Nugteren 9683b50c55 Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp) 2016-07-03 20:30:47 +02:00
Cedric Nugteren 5a690f4e36 Prints the current pandas version and reports the minimum required version 2016-07-02 16:44:13 +02:00
Cedric Nugteren eab8d3cda1 Minor fix to the database script 2016-06-19 14:55:17 +02:00
Cedric Nugteren f726fbdc9f Moved all headers into the source tree, changed headers to .hpp extension 2016-06-18 20:20:13 +02:00
Cedric Nugteren 489c5d76cf Merged in latest changes from 0.7.1 release 2016-05-18 21:32:56 +02:00
Cedric Nugteren 120c31a30f Initial experimental version of the half-precision HAXPY routine 2016-05-13 20:49:34 +02:00
Cedric Nugteren 27d0ac7f38 Added tuning results for AMD Pitcairn (R9 270X) 2016-05-01 19:33:50 +02:00