Cedric Nugteren
|
0e1a152023
|
First version of the tuning API, added interface for copy-kernel, added sample
|
2018-03-06 20:52:12 +01:00 |
|
Cedric Nugteren
|
19fd263fb2
|
Moved some constants from global scope to a function; removed unnecessary includes
|
2018-01-25 20:00:43 +01:00 |
|
Cedric Nugteren
|
249bdaa8e9
|
Reformatted tuning code to make compilation faster
|
2017-12-18 21:34:07 +01:00 |
|
Cedric Nugteren
|
c2f08fa346
|
Fixed an issue in the tuners to prevent error -14 from persisting (CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST)
|
2017-12-10 14:48:13 +01:00 |
|
Cedric Nugteren
|
ca5dbcd2bd
|
Made the pre-processor run by default for ARM and Qualcomm GPUs
|
2017-12-09 15:16:53 +01:00 |
|
Cedric Nugteren
|
13eb772343
|
Integrated pre-processor in compilation flow, default is still disabled
|
2017-11-30 21:32:47 +01:00 |
|
Cedric Nugteren
|
5467c0cac5
|
Fixed a variety of warnings and an error for MSVC2013 compilation
|
2017-11-19 21:09:24 +01:00 |
|
Cedric Nugteren
|
4e0d08c3bc
|
Added compilation timing and better compilation error reporting
|
2017-11-19 16:58:13 +01:00 |
|
Cedric Nugteren
|
a3a8b44f59
|
Some fixes for the new auto-tuner to be compatible with the Python scripts
|
2017-11-19 16:31:08 +01:00 |
|
Cedric Nugteren
|
76d2b7f0b6
|
Revived the GEMM routine tuner; minor formatting changes
|
2017-11-19 12:59:52 +01:00 |
|
Cedric Nugteren
|
8a5a5e031e
|
Moved some tuning functions from .hpp to .cpp
|
2017-11-17 20:58:36 +01:00 |
|
Cedric Nugteren
|
f94d498a37
|
Moved compilation function to separate file; removed the tuners' dependency on the CLBlast library
|
2017-11-17 20:57:46 +01:00 |
|
Cedric Nugteren
|
2b8ad70b63
|
Added printing of the best parameters for the new tuner
|
2017-11-16 21:18:29 +01:00 |
|
Cedric Nugteren
|
1b2b46f2f0
|
Added first version of integrated and re-written auto-tuner
|
2017-11-15 22:49:35 +01:00 |
|
Cedric Nugteren
|
c151ab1325
|
Refactored the tuning architecture: less duplication now; more defaults
|
2017-09-30 20:26:26 +02:00 |
|
Cedric Nugteren
|
76382ff6c1
|
Added the new vendor-architecture-name hierarchy to the tuners as well
|
2017-09-10 16:34:54 +02:00 |
|
Cedric Nugteren
|
54e160cd88
|
Fixed some things in the tuner: bugs, style, and defaults to random search
|
2017-08-31 20:28:01 +02:00 |
|
mcian
|
dfd332524a
|
Remove multistrategy and related functions
|
2017-08-21 14:09:11 +02:00 |
|
mcian
|
473e814718
|
Code refactoring
|
2017-07-23 14:48:13 +02:00 |
|
mcian
|
8131e68664
|
Add PSO parameters support and search strategy selection from command line
|
2017-07-17 12:00:25 +02:00 |
|
Cedric Nugteren
|
11bb30e72b
|
Added the possibility to tune batched kernels
|
2017-03-14 20:29:51 +01:00 |
|
Cedric Nugteren
|
7f14b11f1e
|
Changed the way the test-data is generated: now using a single MT generator and distribution for all data
|
2017-03-05 11:13:47 +01:00 |
|
Cedric Nugteren
|
39c49bf4f9
|
Made it possible to use the command-line environmental variables for each executable and without re-running CMake
|
2016-11-27 11:00:29 +01:00 |
|
Cedric Nugteren
|
b0ff11acf0
|
Moved files around a bit; created a utilities subfolder
|
2016-10-22 15:36:48 +02:00 |
|
Cedric Nugteren
|
ecc704cc76
|
Added default num-runs to the tuner, adding averaging over 10 runs as a default for the GEMM direct kernel
|
2016-10-01 16:55:21 +02:00 |
|
Cedric Nugteren
|
d59e5c570b
|
Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0
|
2016-09-27 21:03:24 +02:00 |
|
Cedric Nugteren
|
6178fcd584
|
Now generates test/client/tuner data using a fixed seed to enable reproducibility of results
|
2016-09-27 19:55:21 +02:00 |
|
Cedric Nugteren
|
f726fbdc9f
|
Moved all headers into the source tree, changed headers to .hpp extension
|
2016-06-18 20:20:13 +02:00 |
|