Commit graph

604 commits

Author SHA1 Message Date
Cedric Nugteren f14e6f87d2 Updated tuning results for the Skylake ULT GT2 GPU with the new kernel 2018-04-15 11:45:45 +02:00
Cedric Nugteren 0dff7f1ac4 Made GEMM rotation expectations kernel-specific 2018-04-13 22:27:11 +02:00
Cedric Nugteren 0f49dd24e5 Updated database with defaults of GEMMK=0 and KREG=1 2018-04-10 21:26:18 +02:00
Cedric Nugteren 77ba11f686 Extended the maximum number of tuning parameters from 14 to 16 2018-04-08 18:12:54 +02:00
Cedric Nugteren a93fec1026 Fixed issues with the pre-processor 2018-04-08 18:02:44 +02:00
Cedric Nugteren 7cbc6b7495 Merge branch 'master' into CLBlast-228-2d-register-gemm-kernel 2018-04-07 17:51:40 +02:00
Cedric Nugteren 16f7f49683 Added tuning results for NVIDIA GeForce 970 2018-04-07 17:48:25 +02:00
Cedric Nugteren 9596e46d01 Added tuning results for NVIDIA GeForce 920MX 2018-04-07 17:44:32 +02:00
Cedric Nugteren 048fe90e57 Added tuning results for Intel HD Graphics 620 2018-04-07 17:33:57 +02:00
Cedric Nugteren 3519d32ac4 Extended the GEMM tuner to be able to tune the new 'kernel 1' 2018-04-07 17:05:44 +02:00
Cedric Nugteren 381f1fe67a Fixed a compilation issue for complex datatypes and vload 2018-04-07 16:57:36 +02:00
Cedric Nugteren 2a29dc061c Fixed a compilation issue for complex datatypes and vload 2018-04-06 21:06:13 +02:00
Cedric Nugteren eae25f5727 Added first version of 2D register tiling kernel with A and C transposed as well 2018-04-03 21:18:40 +02:00
Cedric Nugteren 63996eb68b Updated pyclblast to 1.1.0 and uploaded to PyPi 2018-03-30 10:38:36 +02:00
Cedric Nugteren 4de220a7a2 Merge pull request #255 from kodonnell/py_override (Adding override parameters to pyclblast) 2018-03-30 10:28:00 +02:00
Cedric Nugteren d86ff75fa5 Added argument checking for the GEMM tuner: expects m/n to be multiples of MWG/NWG 2018-03-30 10:23:33 +02:00
Cedric Nugteren bb0889fa7a Merge branch 'CLBlast-227-vivante-compiler-errors' 2018-03-30 09:22:09 +02:00
kodonell 173a7eb928 merged 2018-03-27 08:55:39 +13:00
kodonell f07c2a29b8 moved override_parameters example out of sgemm example 2018-03-27 08:30:58 +13:00
kodonell 58e70c56f1 tidying up pyclblast override_parameters api, and added example 2018-03-26 08:51:55 +13:00
Cedric Nugteren 1cbe2ea301 Removed arrays as function argument from GEMM kernels for Vivante OpenCL compiler 2018-03-23 20:29:20 +01:00
Cedric Nugteren 9fb6550dd0 Added the OpenCL local memory size constraint to the tuners 2018-03-22 21:01:02 +01:00
Cedric Nugteren 7a2371213b Re-added support for local memory size constraint checking in the tuner 2018-03-21 22:58:37 +01:00
Cedric Nugteren 52791bf355 Fixed a failing TRSM test using a CPU with Apple OpenCL 2018-03-15 21:09:52 +01:00
Cedric Nugteren 7a756cbce7 Fixed a failing TRSV test using a CPU with Apple OpenCL 2018-03-15 20:58:42 +01:00
Cedric Nugteren 9ff6cd7547 Added queue-finish commands to PyCLBlast samples and tests 2018-03-15 20:37:48 +01:00
Cedric Nugteren 934893972e Merge pull request #262 from CNugteren/CLBlast-237-tuning-api (CLBlast #237: Tuning API) 2018-03-11 15:38:33 +01:00
Cedric Nugteren bcf1208431 Added basic tests for PyCLBlast 2018-03-11 15:32:36 +01:00
Cedric Nugteren 903deaf368 Fixed an issue for DLL linking under Windows 2018-03-10 16:45:31 +01:00
Cedric Nugteren 3d2ef9331b Fixed a few things for the new tuning API 2018-03-10 14:35:11 +01:00
Cedric Nugteren 0bdc51e47c Completed the API for all tuneable kernels 2018-03-10 10:54:44 +01:00
kodonell c6056da0c8 ok, device id working 2018-03-10 22:21:30 +13:00
Cedric Nugteren 6397e61746 Added several more tuner API functions 2018-03-09 21:40:22 +01:00
kodonell 54a4b871b3 initial add of override parameters to pyclblast - cython not complaining, but segfault 2018-03-09 15:27:33 +13:00
Cedric Nugteren 49cc8b31ff Fixed compilation issue in Xger tuner 2018-03-06 20:59:23 +01:00
Cedric Nugteren 0e1a152023 First version of the tuning API, added interface for copy-kernel, added sample 2018-03-06 20:52:12 +01:00
Cedric Nugteren a1cedf36e3 Separate kernel tuners in .cpp with main and .hpp with settings 2018-03-03 16:37:31 +01:00
Cedric Nugteren bff64917bd Fixed some small issues regarding PR#253 2018-03-03 10:43:12 +01:00
sivagnanamn 1433dc67f1 Added C API for getting GEMM temp buffer size 2018-03-03 03:00:17 +09:00
Cedric Nugteren 11f765c16c Generated function signatures/inspect for PyCLBlast 2018-02-25 15:31:38 +01:00
Cedric Nugteren 13dc26e63d Generated PyCLBlast docstrings 2018-02-25 15:30:57 +01:00
Cedric Nugteren 0557694d39 Fixed several issues in the new invert tuner 2018-02-20 20:53:13 +01:00
Cedric Nugteren fc10a4baca Set initial pyclblast to be version 1.0.0 2018-02-18 20:19:19 +01:00
Cedric Nugteren ce5e2a1e00 Prepared PyCLBlast for release as a package on PyPi 2018-02-18 18:01:02 +01:00
Cedric Nugteren 76c21a95c2 Added PyCLBlast samples 2018-02-18 17:59:43 +01:00
Cedric Nugteren a66e24a009 Added all other level 1/2/3 routines to pyclblast 2018-02-18 17:34:10 +01:00
Cedric Nugteren e1bfb40827 Added GEMM to the Python wrapper 2018-02-18 16:33:20 +01:00
Cedric Nugteren eb85f6b514 First agenerated version (clblastXswap only for now) of the pyclblast wrapper 2018-02-14 20:50:47 +01:00
Cedric Nugteren 61b8c771ed Added skeleton for Python interface using Cython 2018-02-13 21:42:32 +01:00
Cedric Nugteren 70d0fe89c6 Fixed a minor typo 2018-02-11 15:31:08 +01:00