Commit graph

1140 commits

Author SHA1 Message Date
Cedric Nugteren f6a48f05ed Made it possible to add tuning parameters to the database using the script 2018-04-10 21:24:36 +02:00
Cedric Nugteren 3fbbb81137 Fixed a bug in the compression part of the database script 2018-04-10 21:18:11 +02:00
Cedric Nugteren 77ba11f686 Extended the maximum number of tuning parameters from 14 to 16 2018-04-08 18:12:54 +02:00
Cedric Nugteren a93fec1026 Fixed issues with the pre-processor 2018-04-08 18:02:44 +02:00
Cedric Nugteren 7cbc6b7495 Merge branch 'master' into CLBlast-228-2d-register-gemm-kernel 2018-04-07 17:51:40 +02:00
Cedric Nugteren 16f7f49683 Added tuning results for NVIDIA GeForce 970 2018-04-07 17:48:25 +02:00
Cedric Nugteren 9596e46d01 Added tuning results for NVIDIA GeForce 920MX 2018-04-07 17:44:32 +02:00
Cedric Nugteren cf7965dc68 Fixed a python3 import error issue with the database script 2018-04-07 17:40:43 +02:00
Cedric Nugteren 048fe90e57 Added tuning results for Intel HD Graphics 620 2018-04-07 17:33:57 +02:00
Cedric Nugteren 3519d32ac4 Extended the GEMM tuner to be able to tune the new 'kernel 1' 2018-04-07 17:05:44 +02:00
Cedric Nugteren 381f1fe67a Fixed a compilation issue for complex datatypes and vload 2018-04-07 16:57:36 +02:00
Cedric Nugteren 2a29dc061c Fixed a compilation issue for complex datatypes and vload 2018-04-06 21:06:13 +02:00
Cedric Nugteren eae25f5727 Added first version of 2D register tiling kernel with A and C transposed as well 2018-04-03 21:18:40 +02:00
Cedric Nugteren 63996eb68b Updated pyclblast to 1.1.0 and uploaded to PyPi 2018-03-30 10:38:36 +02:00
Cedric Nugteren 4de220a7a2 Merge pull request #255 from kodonnell/py_override (Adding override parameters to pyclblast) 2018-03-30 10:28:00 +02:00
Cedric Nugteren d86ff75fa5 Added argument checking for the GEMM tuner: expects m/n to be multiples of MWG/NWG 2018-03-30 10:23:33 +02:00
Cedric Nugteren 7e69c422af Updated the roadmap 2018-03-30 10:05:16 +02:00
Cedric Nugteren bb0889fa7a Merge branch 'CLBlast-227-vivante-compiler-errors' 2018-03-30 09:22:09 +02:00
kodonell 173a7eb928 merged 2018-03-27 08:55:39 +13:00
kodonell d16f2d1317 got the generator thing working 2018-03-27 08:45:54 +13:00
kodonell f07c2a29b8 moved override_parameters example out of sgemm example 2018-03-27 08:30:58 +13:00
kodonell 58e70c56f1 tidying up pyclblast override_parameters api, and added example 2018-03-26 08:51:55 +13:00
Cedric Nugteren 1cbe2ea301 Removed arrays as function argument from GEMM kernels for Vivante OpenCL compiler 2018-03-23 20:29:20 +01:00
Cedric Nugteren a97d8a0197 Merge pull request #269 from CNugteren/CLBlast-266-local-mem-constraint (CLBlast #266 local mem constraint) 2018-03-22 22:42:33 +01:00
Cedric Nugteren 9fb6550dd0 Added the OpenCL local memory size constraint to the tuners 2018-03-22 21:01:02 +01:00
Cedric Nugteren 7a2371213b Re-added support for local memory size constraint checking in the tuner 2018-03-21 22:58:37 +01:00
Cedric Nugteren 52791bf355 Fixed a failing TRSM test using a CPU with Apple OpenCL 2018-03-15 21:09:52 +01:00
Cedric Nugteren 7a756cbce7 Fixed a failing TRSV test using a CPU with Apple OpenCL 2018-03-15 20:58:42 +01:00
Cedric Nugteren f4d96e80c3 Fixed breaking preprocessor test on certain platforms due to empty kernel string 2018-03-15 20:45:41 +01:00
Cedric Nugteren 9ff6cd7547 Added queue-finish commands to PyCLBlast samples and tests 2018-03-15 20:37:48 +01:00
Cedric Nugteren 934893972e Merge pull request #262 from CNugteren/CLBlast-237-tuning-api (CLBlast #237: Tuning API) 2018-03-11 15:38:33 +01:00
Cedric Nugteren bcf1208431 Added basic tests for PyCLBlast 2018-03-11 15:32:36 +01:00
Cedric Nugteren 0dd1bc6f48 Made benchmarking script also work for complex numbers 2018-03-10 17:03:57 +01:00
Cedric Nugteren 49b02ec194 Added initial glossary 2018-03-10 17:02:38 +01:00
Cedric Nugteren 86455841d1 Added badge for OSX-Intel-CPU builds 2018-03-10 16:49:36 +01:00
Cedric Nugteren 903deaf368 Fixed an issue for DLL linking under Windows 2018-03-10 16:45:31 +01:00
Cedric Nugteren e7dccfa3cc Fixed an issue for DLL linking under Windows 2018-03-10 14:57:36 +01:00
Cedric Nugteren 54bbc99273 Updated the documentation for the tuner API 2018-03-10 14:52:40 +01:00
Cedric Nugteren 3d2ef9331b Fixed a few things for the new tuning API 2018-03-10 14:35:11 +01:00
Cedric Nugteren 0bdc51e47c Completed the API for all tuneable kernels 2018-03-10 10:54:44 +01:00
kodonell c6056da0c8 ok, device id working 2018-03-10 22:21:30 +13:00
Cedric Nugteren 6397e61746 Added several more tuner API functions 2018-03-09 21:40:22 +01:00
kodonell 54a4b871b3 initial add of override parameters to pyclblast - cython not complaining, but segfault 2018-03-09 15:27:33 +13:00
Cedric Nugteren 49cc8b31ff Fixed compilation issue in Xger tuner 2018-03-06 20:59:23 +01:00
Cedric Nugteren 0e1a152023 First version of the tuning API, added interface for copy-kernel, added sample 2018-03-06 20:52:12 +01:00
Cedric Nugteren a1cedf36e3 Separate kernel tuners in .cpp with main and .hpp with settings 2018-03-03 16:37:31 +01:00
Cedric Nugteren 269bddbf34 Fixed the buildbot badges in the README 2018-03-03 13:12:09 +01:00
Cedric Nugteren 1aef354577 Updated documentation and build badges 2018-03-03 10:57:06 +01:00
Cedric Nugteren bff64917bd Fixed some small issues regarding PR#253 2018-03-03 10:43:12 +01:00
Cedric Nugteren e3384be0d0 Merge pull request #253 from sivagnanamn/master (Added C API for getting GEMM temp buffer size) 2018-03-03 10:29:47 +01:00