118 commits

Author SHA1 Message Date
Cedric Nugteren 3519d32ac4 Extended the GEMM tuner to be able to tune the new 'kernel 1' 2018-04-07 17:05:44 +02:00
Cedric Nugteren d86ff75fa5 Added argument checking for the GEMM tuner: expects m/n to be multiples of MWG/NWG 2018-03-30 10:23:33 +02:00
Cedric Nugteren 9fb6550dd0 Added the OpenCL local memory size constraint to the tuners 2018-03-22 21:01:02 +01:00
Cedric Nugteren 7a2371213b Re-added support for local memory size constraint checking in the tuner 2018-03-21 22:58:37 +01:00
Cedric Nugteren 903deaf368 Fixed an issue for DLL linking under Windows 2018-03-10 16:45:31 +01:00
Cedric Nugteren 3d2ef9331b Fixed a few things for the new tuning API 2018-03-10 14:35:11 +01:00
Cedric Nugteren 0bdc51e47c Completed the API for all tuneable kernels 2018-03-10 10:54:44 +01:00
Cedric Nugteren 6397e61746 Added several more tuner API functions 2018-03-09 21:40:22 +01:00
Cedric Nugteren 49cc8b31ff Fixed compilation issue in Xger tuner 2018-03-06 20:59:23 +01:00
Cedric Nugteren 0e1a152023 First version of the tuning API, added interface for copy-kernel, added sample 2018-03-06 20:52:12 +01:00
Cedric Nugteren a1cedf36e3 Separate kernel tuners in .cpp with main and .hpp with settings 2018-03-03 16:37:31 +01:00
Cedric Nugteren 0557694d39 Fixed several issues in the new invert tuner 2018-02-20 20:53:13 +01:00
Cedric Nugteren 19fd263fb2 Moved some constants from global scope to a function; removed unnecessary includes 2018-01-25 20:00:43 +01:00
Cedric Nugteren 6a9d6b5da2 Changed the default number of runs for the GEMV tuner to fix issues for FP16 2018-01-25 19:57:36 +01:00
Cedric Nugteren c3f9371d16 Made GEMM routine tuning a bit more generic in preparation of possible separate batched tuning arguments 2018-01-18 19:41:59 +01:00
Cedric Nugteren 0e5eaa6eb9 Factored out the generic parts of the GEMM routine tuner 2018-01-15 21:32:51 +01:00
Cedric Nugteren c9b5d614e2 Fixed a vendor naming bug in the tuners and in the database 2018-01-06 17:02:58 +01:00
Cedric Nugteren ef71d8e9b5 Fixed unused variable warnings showing up with Clang 2017-12-23 16:07:26 +01:00
Cedric Nugteren 288766debb Now calling main TRSV routine again to fix compilation in MSVC 2017-12-23 14:49:21 +01:00
Cedric Nugteren 736399e528 Split the invert kernel in two parts to prevent error C1091 in MSVC 2013 2017-12-23 14:18:07 +01:00
Cedric Nugteren b1f52f130c Updated the database to use the new TRSV and Invert tuners 2017-12-23 13:55:22 +01:00
Cedric Nugteren aa7db4f987 Added TRSV block-size tuner 2017-12-23 13:34:57 +01:00
Cedric Nugteren 07a7012b0d Added skeleton for a tuner for the invert kernel 2017-12-19 21:10:48 +01:00
Cedric Nugteren 249bdaa8e9 Reformatted tuning code to make compilation faster 2017-12-18 21:34:07 +01:00
Cedric Nugteren e2f8068459 Fixed an issue with the tuner: it was using platform vendor rather than device vendor 2017-12-17 17:58:06 +01:00
Cedric Nugteren 7408f6e6eb Fixed an unnecessary overflow issue on 32-bit systems 2017-12-17 16:42:54 +01:00
Cedric Nugteren b4d3a50f19 Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit 2017-12-10 16:09:09 +01:00
Cedric Nugteren c2f08fa346 Fixed an issue in the tuners to prevent error -14 from persisting (CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST) 2017-12-10 14:48:13 +01:00
Cedric Nugteren ca5dbcd2bd Made the pre-processor run by default for ARM and Qualcomm GPUs 2017-12-09 15:16:53 +01:00
Cedric Nugteren 13eb772343 Integrated pre-processor in compilation flow, default is still disabled 2017-11-30 21:32:47 +01:00
Cedric Nugteren e0f3484084 Fixes some displaying issues in the GEMM routine tuner 2017-11-20 20:29:52 +01:00
Cedric Nugteren 5467c0cac5 Fixed a variety of warnings and an error for MSVC2013 compilation 2017-11-19 21:09:24 +01:00
Cedric Nugteren 4e0d08c3bc Added compilation timing and better compilation error reporting 2017-11-19 16:58:13 +01:00
Cedric Nugteren a3a8b44f59 Some fixes for the new auto-tuner to be compatible with the Python scripts 2017-11-19 16:31:08 +01:00
Cedric Nugteren 76d2b7f0b6 Revived the GEMM routine tuner; minor formatting changes 2017-11-19 12:59:52 +01:00
Cedric Nugteren 7a54494577 Modified the kernel tuners to use the newly integrated auto-tuner 2017-11-19 12:58:41 +01:00
Cedric Nugteren 8a5a5e031e Moved some tuning functions from .hpp to .cpp 2017-11-17 20:58:36 +01:00
Cedric Nugteren f94d498a37 Moved compilation function to separate file; removed dependency of tuners of the CLBlast library 2017-11-17 20:57:46 +01:00
Cedric Nugteren 2b8ad70b63 Added printing of the best parameters for the new tuner 2017-11-16 21:18:29 +01:00
Cedric Nugteren 1b2b46f2f0 Added first version of integrated and re-written auto-tuner 2017-11-15 22:49:35 +01:00
Cedric Nugteren 34a33b54cf Changed GEMM routine tuner's scoring to use L2 measure instead for better averaging 2017-11-06 20:50:36 +01:00
Cedric Nugteren 9b0a435fb0 Integrated the GEMM routine tuner for kernel selection; added first tuning results 2017-11-02 21:47:14 +01:00
Cedric Nugteren 5c90577dfd Added collecting and printing of scores for the kernel-selection tuner 2017-10-30 20:39:21 +01:00
Cedric Nugteren 334a26eb12 Added initial version of a GEMM kernel selection tuner 2017-10-28 17:30:29 +02:00
Cedric Nugteren 375193fe4e GEMM indirect implementation now uses only 1 larger temporary buffer instead of up to 3 optional ones 2017-10-03 21:55:21 +02:00
Cedric Nugteren c151ab1325 Refactored the tuning architecture: less duplication now; more defaults 2017-09-30 20:26:26 +02:00
Cedric Nugteren 76382ff6c1 Added the new vendor-architecture-name hierarchy to the tuners as well 2017-09-10 16:34:54 +02:00
Cedric Nugteren 54e160cd88 Fixed some things in the tuner: bugs, style, and defaults to random search 2017-08-31 20:28:01 +02:00
Cedric Nugteren da28cc5e93 Minor updates after merging in the PSO addition to the tuners 2017-08-21 20:14:02 +02:00
mcian dfd332524a Remove multistrategy and related functions 2017-08-21 14:09:11 +02:00