Commit Graph

1128 Commits (048fe90e57a00ca33681feb716474e2046ec7f5b)

Author SHA1 Message Date
Cedric Nugteren 048fe90e57 Added tuning results for Intel HD Graphics 620 2018-04-07 17:33:57 +02:00
Cedric Nugteren 63996eb68b Updated pyclblast to 1.1.0 and uploaded to PyPi 2018-03-30 10:38:36 +02:00
Cedric Nugteren 4de220a7a2 Merge pull request #255 from kodonnell/py_override (Adding override parameters to pyclblast) 2018-03-30 10:28:00 +02:00
Cedric Nugteren d86ff75fa5 Added argument checking for the GEMM tuner: expects m/n to be multiples of MWG/NWG 2018-03-30 10:23:33 +02:00
Cedric Nugteren 7e69c422af Updated the roadmap 2018-03-30 10:05:16 +02:00
Cedric Nugteren bb0889fa7a Merge branch 'CLBlast-227-vivante-compiler-errors' 2018-03-30 09:22:09 +02:00
kodonell 173a7eb928 merged 2018-03-27 08:55:39 +13:00
kodonell d16f2d1317 got the generator thing working 2018-03-27 08:45:54 +13:00
kodonell f07c2a29b8 moved override_parameters example out of sgemm example 2018-03-27 08:30:58 +13:00
kodonell 58e70c56f1 tidying up pyclblast override_parameters api, and added example 2018-03-26 08:51:55 +13:00
Cedric Nugteren 1cbe2ea301 Removed arrays as function argument from GEMM kernels for Vivante OpenCL compiler 2018-03-23 20:29:20 +01:00
Cedric Nugteren a97d8a0197 Merge pull request #269 from CNugteren/CLBlast-266-local-mem-constraint (CLBlast #266 local mem constraint) 2018-03-22 22:42:33 +01:00
Cedric Nugteren 9fb6550dd0 Added the OpenCL local memory size constraint to the tuners 2018-03-22 21:01:02 +01:00
Cedric Nugteren 7a2371213b Re-added support for local memory size constraint checking in the tuner 2018-03-21 22:58:37 +01:00
Cedric Nugteren 52791bf355 Fixed a failing TRSM test using a CPU with Apple OpenCL 2018-03-15 21:09:52 +01:00
Cedric Nugteren 7a756cbce7 Fixed a failing TRSV test using a CPU with Apple OpenCL 2018-03-15 20:58:42 +01:00
Cedric Nugteren f4d96e80c3 Fixed breaking preprocessor test on certain platforms due to empty kernel string 2018-03-15 20:45:41 +01:00
Cedric Nugteren 9ff6cd7547 Added queue-finish commands to PyCLBlast samples and tests 2018-03-15 20:37:48 +01:00
Cedric Nugteren 934893972e Merge pull request #262 from CNugteren/CLBlast-237-tuning-api (CLBlast #237: Tuning API) 2018-03-11 15:38:33 +01:00
Cedric Nugteren bcf1208431 Added basic tests for PyCLBlast 2018-03-11 15:32:36 +01:00
Cedric Nugteren 0dd1bc6f48 Made benchmarking script also work for complex numbers 2018-03-10 17:03:57 +01:00
Cedric Nugteren 49b02ec194 Added initial glossary 2018-03-10 17:02:38 +01:00
Cedric Nugteren 86455841d1 Added badge for OSX-Intel-CPU builds 2018-03-10 16:49:36 +01:00
Cedric Nugteren 903deaf368 Fixed an issue for DLL linking under Windows 2018-03-10 16:45:31 +01:00
Cedric Nugteren e7dccfa3cc Fixed an issue for DLL linking under Windows 2018-03-10 14:57:36 +01:00
Cedric Nugteren 54bbc99273 Updated the documentation for the tuner API 2018-03-10 14:52:40 +01:00
Cedric Nugteren 3d2ef9331b Fixed a few things for the new tuning API 2018-03-10 14:35:11 +01:00
Cedric Nugteren 0bdc51e47c Completed the API for all tuneable kernels 2018-03-10 10:54:44 +01:00
kodonell c6056da0c8 ok, device id working 2018-03-10 22:21:30 +13:00
Cedric Nugteren 6397e61746 Added several more tuner API functions 2018-03-09 21:40:22 +01:00
kodonell 54a4b871b3 initial add of override parameters to pyclblast - cython not complaining, but segfault 2018-03-09 15:27:33 +13:00
Cedric Nugteren 49cc8b31ff Fixed compilation issue in Xger tuner 2018-03-06 20:59:23 +01:00
Cedric Nugteren 0e1a152023 First version of the tuning API, added interface for copy-kernel, added sample 2018-03-06 20:52:12 +01:00
Cedric Nugteren a1cedf36e3 Separate kernel tuners in .cpp with main and .hpp with settings 2018-03-03 16:37:31 +01:00
Cedric Nugteren 269bddbf34 Fixed the buildbot badges in the README 2018-03-03 13:12:09 +01:00
Cedric Nugteren 1aef354577 Updated documentation and build badges 2018-03-03 10:57:06 +01:00
Cedric Nugteren bff64917bd Fixed some small issues regarding PR#253 2018-03-03 10:43:12 +01:00
Cedric Nugteren e3384be0d0 Merge pull request #253 from sivagnanamn/master (Added C API for getting GEMM temp buffer size) 2018-03-03 10:29:47 +01:00
sivagnanamn 1433dc67f1 Added C API for getting GEMM temp buffer size 2018-03-03 03:00:17 +09:00
Cedric Nugteren 1940e67009 Updated the changelog 2018-02-26 19:53:50 +01:00
Cedric Nugteren be8d95bd03 Added a note on preventing segfaults with OpenCL using the AMD APP SDK 2018-02-26 19:52:53 +01:00
Cedric Nugteren d6fa188abf Merge pull request #249 from CNugteren/documation_reorg (Documation update) 2018-02-25 18:21:47 +01:00
Cedric Nugteren c10bbff751 Fixed Ubuntu PPA package name 2018-02-25 16:15:20 +01:00
Cedric Nugteren 11f765c16c Generated function signatures/inspect for PyCLBlast 2018-02-25 15:31:38 +01:00
Cedric Nugteren 13dc26e63d Generated PyCLBlast docstrings 2018-02-25 15:30:57 +01:00
Cedric Nugteren 6710c60935 Some style improvements in the pyclblast code generator 2018-02-25 14:51:58 +01:00
Cedric Nugteren 9699169cdf Added API documentation for two missing C++ functions 2018-02-25 14:44:22 +01:00
Cedric Nugteren ced830539e Split the documentation and updated where needed 2018-02-24 21:11:28 +01:00
Cedric Nugteren e784df0230 Renamed the API documentation 2018-02-24 20:46:44 +01:00
Cedric Nugteren 45cc809879 Updated the roadmap 2018-02-24 20:46:14 +01:00