Cedric Nugteren
|
d929525039
|
Added support for the convgemm tuner in the tuner database
|
2018-12-31 18:49:12 +01:00 |
|
Cedric Nugteren
|
f14e6f87d2
|
Updated tuning results for the Skylake ULT GT2 GPU with the new kernel
|
2018-04-15 11:45:45 +02:00 |
|
Cedric Nugteren
|
f6a48f05ed
|
Made it possible to add tuning parameters to the database using the script
|
2018-04-10 21:24:36 +02:00 |
|
Cedric Nugteren
|
3fbbb81137
|
Fixed a bug in the compression part of the database script
|
2018-04-10 21:18:11 +02:00 |
|
Cedric Nugteren
|
77ba11f686
|
Extended the maximum number of tuning parameters from 14 to 16
|
2018-04-08 18:12:54 +02:00 |
|
Cedric Nugteren
|
cf7965dc68
|
Fixed a Python 3 import error in the database script
|
2018-04-07 17:40:43 +02:00 |
|
Cedric Nugteren
|
1e738db6dd
|
Split the database into multiple small compilation units
|
2017-12-27 12:04:22 +01:00 |
|
Cedric Nugteren
|
0ee81e27b9
|
Added tuning results for Apple AMD Radeon Pro 580
|
2017-12-20 19:59:31 +01:00 |
|
Cedric Nugteren
|
c680666250
|
Added try-except to database script parser to skip invalid files
|
2017-12-20 19:14:04 +01:00 |
|
Cedric Nugteren
|
606990af6f
|
Made the database script properly handle multiple entries for a single device
|
2017-11-20 21:38:23 +01:00 |
|
Cedric Nugteren
|
defad3d1a2
|
Minor fix to the database script
|
2017-11-19 18:19:21 +01:00 |
|
Cedric Nugteren
|
a3a8b44f59
|
Some fixes for the new auto-tuner to be compatible with the Python scripts
|
2017-11-19 16:31:08 +01:00 |
|
Cedric Nugteren
|
33ac2b0175
|
Improved the way the database defaults are computed
|
2017-11-06 21:59:45 +01:00 |
|
Cedric Nugteren
|
9b0a435fb0
|
Integrated the GEMM routine tuner for kernel selection; added first tuning results
|
2017-11-02 21:47:14 +01:00 |
|
Cedric Nugteren
|
73272ab97d
|
Fixed a bug in database compression/decompression
|
2017-11-02 21:19:18 +01:00 |
|
Cedric Nugteren
|
4e317f5e85
|
Improved compilation time of the tuner database
|
2017-09-16 18:02:37 +02:00 |
|
Cedric Nugteren
|
0d13d814c2
|
Added architecture layer in the tuning database for better performance on unseen devices
|
2017-09-14 21:27:33 +02:00 |
|
Cedric Nugteren
|
14a61d2425
|
Added database compress and de-compress functions
|
2017-09-12 22:25:52 +02:00 |
|
Cedric Nugteren
|
ebe10d5118
|
Database now works with new format of clblast_[property]
|
2017-09-11 20:40:37 +02:00 |
|
Cedric Nugteren
|
20da5e33a8
|
Split the database files over multiple directories and files; first step towards separate compilation
|
2017-09-06 21:50:42 +02:00 |
|
Cedric Nugteren
|
615a7fdc81
|
Fixes some compilation issues related to the database structure change
|
2017-06-21 23:07:47 +02:00 |
|
Cedric Nugteren
|
e44feb8576
|
Changed the structure of the database to reduce compilation time and save memory
|
2017-06-20 21:19:26 +02:00 |
|
Grigori Fursin
|
35e2e6c3a4
|
Changed "wb" to "w" when saving the JSON file (text mode) for compatibility with Python 3
|
2017-05-24 15:08:34 +02:00 |
|
Cedric Nugteren
|
67d4bbff66
|
Added an option to the database script to remove tuning results from the database
|
2017-04-23 17:59:16 +02:00 |
|
Cedric Nugteren
|
1c33af6eab
|
Re-added Titan X (Pascal) tuning results based on more averaging when tuning
|
2017-04-23 17:58:56 +02:00 |
|
Cedric Nugteren
|
3ec14df60e
|
Added proper handling of mismatched arguments in the database script
|
2017-04-17 15:00:45 +02:00 |
|
Cedric Nugteren
|
32b850b12b
|
Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU
|
2017-01-03 20:30:56 +01:00 |
|
Cedric Nugteren
|
080e1be684
|
Improved the default parameters for cases with non-common parameters across all devices
|
2016-11-26 16:38:17 +01:00 |
|
Cedric Nugteren
|
0f9311d46a
|
Fixed an issue with a growing database: the database is now a global variable in a namespace and its container uses const-pointers to the actual data
|
2016-10-14 20:56:32 +02:00 |
|
Cedric Nugteren
|
39afc9543b
|
Changed the storage location of the database to a separate GitHub repository
|
2016-10-10 19:10:12 +02:00 |
|
Cedric Nugteren
|
aa3dffe356
|
Added XgemvFastRot and Xgemm 16-bit tuning results: just defaults which are now automatically taken from 32-bit if there are no entries at all
|
2016-09-12 20:13:38 +02:00 |
|
Cedric Nugteren
|
b5a67f86ec
|
Complete re-write of the database script. Replaced Pandas with the much faster and more convenient plain JSON/dict data type
|
2016-09-11 21:29:28 +02:00 |
|
Cedric Nugteren
|
e21f32bc99
|
Updated database based on exhaustive tuning results for GEMM for the R9 M370X GPU
|
2016-09-10 14:00:43 +02:00 |
|
Cedric Nugteren
|
3daba70997
|
Updated the database script to remove duplicate entries: keeps only the best-performing cases for a specific parameter combination
|
2016-09-10 11:12:09 +02:00 |
|
Cedric Nugteren
|
521bf6cdfc
|
Added tuning results for Intel Broadwell 5500 GT2 GPU
|
2016-09-03 16:43:23 +02:00 |
|
Cedric Nugteren
|
19574b2519
|
Updated tuning results for Haswell GT2 Mobile GPU; fixed database script to handle duplicate entries of different runs
|
2016-09-03 12:45:11 +02:00 |
|
Cedric Nugteren
|
0c0f0ac7f9
|
Also changed the default-default for unknown device types to use the same method as for known device groups
|
2016-08-21 20:35:20 +02:00 |
|
Cedric Nugteren
|
00979faab4
|
Updated the changelog; refactored the database-get-bests code a bit
|
2016-08-21 20:16:06 +02:00 |
|
Cedric Nugteren
|
7d5631b7e4
|
Updated the database script to calculate the relative best performance of tuning results common for a device/vendor type
|
2016-08-15 21:01:07 +02:00 |
|
Cedric Nugteren
|
7da6492b36
|
Improved the speed of the new common-best defaults method for the database generation
|
2016-08-09 21:06:04 +02:00 |
|
Cedric Nugteren
|
3f5401d4c8
|
Added a first version of the database's common-best default calculation
|
2016-08-07 16:25:38 +02:00 |
|
Cedric Nugteren
|
2582f0290a
|
Moved the XgemvFast and XgemvFastRot tuning database into a separate file
|
2016-07-25 22:43:49 +02:00 |
|
Cedric Nugteren
|
622682ffe3
|
Refactored the Python database script: separated functionality into modules, now complies with the PEP8 style, added proper command-line argument parsing, and cleaned up the code
|
2016-07-24 16:41:01 +02:00 |
|
Cedric Nugteren
|
9683b50c55
|
Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp)
|
2016-07-03 20:30:47 +02:00 |
|
Cedric Nugteren
|
5a690f4e36
|
Prints the current pandas version and reports the minimum required version
|
2016-07-02 16:44:13 +02:00 |
|
Cedric Nugteren
|
eab8d3cda1
|
Minor fix to the database script
|
2016-06-19 14:55:17 +02:00 |
|
Cedric Nugteren
|
f726fbdc9f
|
Moved all headers into the source tree, changed headers to .hpp extension
|
2016-06-18 20:20:13 +02:00 |
|
Cedric Nugteren
|
489c5d76cf
|
Merged in latest changes from 0.7.1 release
|
2016-05-18 21:32:56 +02:00 |
|
Cedric Nugteren
|
120c31a30f
|
Initial experimental version of the half-precision HAXPY routine
|
2016-05-13 20:49:34 +02:00 |
|
Cedric Nugteren
|
27d0ac7f38
|
Added tuning results for AMD Pitcairn (R9 270X)
|
2016-05-01 19:33:50 +02:00 |
|