Cedric Nugteren
|
4e317f5e85
|
Improved compilation time of the tuner database
|
2017-09-16 18:02:37 +02:00 |
|
Cedric Nugteren
|
0d13d814c2
|
Added architecture layer in the tuning database for better performance on unseen devices
|
2017-09-14 21:27:33 +02:00 |
|
Cedric Nugteren
|
14a61d2425
|
Added database compress and de-compress functions
|
2017-09-12 22:25:52 +02:00 |
|
Cedric Nugteren
|
ebe10d5118
|
Database now works with new format of clblast_[property]
|
2017-09-11 20:40:37 +02:00 |
|
Cedric Nugteren
|
20da5e33a8
|
Split the database files over multiple directories and files; first step towards separate compilation
|
2017-09-06 21:50:42 +02:00 |
|
Cedric Nugteren
|
615a7fdc81
|
Fixes some compilation issues related to the database structure change
|
2017-06-21 23:07:47 +02:00 |
|
Cedric Nugteren
|
e44feb8576
|
Changed the structure of the database to reduce compilation time and save memory
|
2017-06-20 21:19:26 +02:00 |
|
Grigori Fursin
|
35e2e6c3a4
|
changing "wb" to "w" when saving json file (text mode) - compatibility for Python 3
|
2017-05-24 15:08:34 +02:00 |
|
Cedric Nugteren
|
67d4bbff66
|
Added an option to the database script to remove tuning results from the database
|
2017-04-23 17:59:16 +02:00 |
|
Cedric Nugteren
|
1c33af6eab
|
Re-added Titan X (Pascal) tuning results based on more averaging when tuning
|
2017-04-23 17:58:56 +02:00 |
|
Cedric Nugteren
|
3ec14df60e
|
Added proper handling of mismatched arguments in the database script
|
2017-04-17 15:00:45 +02:00 |
|
Cedric Nugteren
|
32b850b12b
|
Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU
|
2017-01-03 20:30:56 +01:00 |
|
Cedric Nugteren
|
080e1be684
|
Improved the default parameters for cases with non-common parameters across all devices
|
2016-11-26 16:38:17 +01:00 |
|
Cedric Nugteren
|
0f9311d46a
|
Fixed an issue with a growing database: the database is now a global variable in a namespace and its container uses const-pointers to the actual data
|
2016-10-14 20:56:32 +02:00 |
|
Cedric Nugteren
|
39afc9543b
|
Changed the storage location of the database to a separate GitHub repository
|
2016-10-10 19:10:12 +02:00 |
|
Cedric Nugteren
|
aa3dffe356
|
Added XgemvFastRot and Xgemm 16-bit tuning results: just defaults which are now automatically taken from 32-bit if there are no entries at all
|
2016-09-12 20:13:38 +02:00 |
|
Cedric Nugteren
|
b5a67f86ec
|
Complete re-write of the database script. Replaced Pandas with the much faster and more convenient plain JSON/dict data type
|
2016-09-11 21:29:28 +02:00 |
|
Cedric Nugteren
|
e21f32bc99
|
Updated database based on exhaustive tuning results for GEMM for the R9 M370X GPU
|
2016-09-10 14:00:43 +02:00 |
|
Cedric Nugteren
|
3daba70997
|
Updated the database script to remove duplicate entries: keeps only the best-performing case for a specific parameter combination
|
2016-09-10 11:12:09 +02:00 |
|
Cedric Nugteren
|
521bf6cdfc
|
Added tuning results for Intel Broadwell 5500 GT2 GPU
|
2016-09-03 16:43:23 +02:00 |
|
Cedric Nugteren
|
19574b2519
|
Updated tuning results for Haswell GT2 Mobile GPU; fixed database script to handle duplicate entries of different runs
|
2016-09-03 12:45:11 +02:00 |
|
Cedric Nugteren
|
0c0f0ac7f9
|
Also changed the default-default for unknown device types to use the same method as for known device groups
|
2016-08-21 20:35:20 +02:00 |
|
Cedric Nugteren
|
00979faab4
|
Updated the changelog; refactored the database-get-bests code a bit
|
2016-08-21 20:16:06 +02:00 |
|
Cedric Nugteren
|
7d5631b7e4
|
Updated the database script to calculate the relative best performance of tuning results common for a device/vendor type
|
2016-08-15 21:01:07 +02:00 |
|
Cedric Nugteren
|
7da6492b36
|
Improved the speed of the new common-best defaults method for the database generation
|
2016-08-09 21:06:04 +02:00 |
|
Cedric Nugteren
|
3f5401d4c8
|
Added a first version of the database's common-best default calculation
|
2016-08-07 16:25:38 +02:00 |
|
Cedric Nugteren
|
2582f0290a
|
Moved the XgemvFast and XgemvFastRot tuning database into a separate file
|
2016-07-25 22:43:49 +02:00 |
|
Cedric Nugteren
|
622682ffe3
|
Refactored the Python database script: separated functionality into modules, now complies with the PEP8 style, added proper command-line argument parsing, and cleaned up
|
2016-07-24 16:41:01 +02:00 |
|
Cedric Nugteren
|
9683b50c55
|
Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp)
|
2016-07-03 20:30:47 +02:00 |
|
Cedric Nugteren
|
5a690f4e36
|
Prints the current pandas version and reports the minimum required version
|
2016-07-02 16:44:13 +02:00 |
|
Cedric Nugteren
|
eab8d3cda1
|
Minor fix to the database script
|
2016-06-19 14:55:17 +02:00 |
|
Cedric Nugteren
|
f726fbdc9f
|
Moved all headers into the source tree, changed headers to .hpp extension
|
2016-06-18 20:20:13 +02:00 |
|
Cedric Nugteren
|
489c5d76cf
|
Merged in latest changes from 0.7.1 release
|
2016-05-18 21:32:56 +02:00 |
|
Cedric Nugteren
|
120c31a30f
|
Initial experimental version of the half-precision HAXPY routine
|
2016-05-13 20:49:34 +02:00 |
|
Cedric Nugteren
|
27d0ac7f38
|
Added tuning results for AMD Pitcairn (R9 270X)
|
2016-05-01 19:33:50 +02:00 |
|
Cedric Nugteren
|
c94b628318
|
Updated tuning database for reduction/dot kernels based on the new tuner; partially repopulated the database
|
2016-05-01 19:17:04 +02:00 |
|
cnugteren
|
a61724ece5
|
Fixed the way the defaults are calculated in the database; added warning for non-matching tuner arguments
|
2016-04-11 22:27:44 -06:00 |
|
Cedric Nugteren
|
fa79720557
|
Added tuning results for Intel Iris Pro and AMD R9 M370X
|
2016-02-28 16:47:52 +01:00 |
|
Cedric Nugteren
|
8854a73127
|
Added XGER routine, kernel, and tuner
|
2016-02-20 12:40:01 +01:00 |
|
Cedric Nugteren
|
165a94c200
|
Various fixes to the database script
|
2016-02-07 16:39:37 +01:00 |
|
Cedric Nugteren
|
00be6f7530
|
Added dictionary with short and long OpenCL vendor names to fix issues with Intel having multiple names
|
2016-02-07 11:59:30 +01:00 |
|
Cedric Nugteren
|
c76f1d9dbb
|
Made the tuning database an optional external download
|
2016-02-07 10:59:51 +01:00 |
|
CNugteren
|
704a729f5c
|
Made the database script compatible with Python 3
|
2016-02-06 13:11:36 +01:00 |
|
Cedric Nugteren
|
276e772a2c
|
Added first auto-generated database headers from the Python database; only K40 and Iris supported now
|
2016-01-30 11:43:21 +01:00 |
|
Cedric Nugteren
|
76c9148030
|
Minor improvements to the database script, including proper file paths
|
2016-01-24 17:56:27 +01:00 |
|
Cedric Nugteren
|
f0b3091cdb
|
Added Python function to compute defaults for a particular device/vendor combination
|
2016-01-24 17:35:31 +01:00 |
|
CNugteren
|
09c94b17cf
|
Added tuning data for Tesla K40
|
2015-10-28 21:20:42 +01:00 |
|
CNugteren
|
bb4e78f737
|
Added initial tuning database with Intel Iris data
|
2015-10-25 16:49:59 +01:00 |
|
CNugteren
|
ccd1a5c7cc
|
Updated tuning database script according to the new JSON format
|
2015-10-25 16:49:29 +01:00 |
|
CNugteren
|
a2d5d7770e
|
Moved the tuner database script to a separate folder
|
2015-10-25 16:27:14 +01:00 |
|