Commit graph

1096 commits

Author SHA1 Message Date
Cedric Nugteren 0e1a152023 First version of the tuning API, added interface for copy-kernel, added sample 2018-03-06 20:52:12 +01:00
Cedric Nugteren a1cedf36e3 Separate kernel tuners in .cpp with main and .hpp with settings 2018-03-03 16:37:31 +01:00
Cedric Nugteren 269bddbf34 Fixed the buildbot badges in the README 2018-03-03 13:12:09 +01:00
Cedric Nugteren 1aef354577 Updated documentation and build badges 2018-03-03 10:57:06 +01:00
Cedric Nugteren bff64917bd Fixed some small issues regarding PR#253 2018-03-03 10:43:12 +01:00
Cedric Nugteren e3384be0d0 Merge pull request #253 from sivagnanamn/master: Added C API for getting GEMM temp buffer size 2018-03-03 10:29:47 +01:00
sivagnanamn 1433dc67f1 Added C API for getting GEMM temp buffer size 2018-03-03 03:00:17 +09:00
Cedric Nugteren 1940e67009 Updated the changelog 2018-02-26 19:53:50 +01:00
Cedric Nugteren be8d95bd03 Added a note on preventing segfaults with OpenCL using the AMD APP SDK 2018-02-26 19:52:53 +01:00
Cedric Nugteren d6fa188abf Merge pull request #249 from CNugteren/documation_reorg: Documation update 2018-02-25 18:21:47 +01:00
Cedric Nugteren c10bbff751 Fixed Ubuntu PPA package name 2018-02-25 16:15:20 +01:00
Cedric Nugteren 11f765c16c Generated function signatures/inspect for PyCLBlast 2018-02-25 15:31:38 +01:00
Cedric Nugteren 13dc26e63d Generated PyCLBlast docstrings 2018-02-25 15:30:57 +01:00
Cedric Nugteren 6710c60935 Some style improvements in the pyclblast code generator 2018-02-25 14:51:58 +01:00
Cedric Nugteren 9699169cdf Added API documentation for two missing C++ functions 2018-02-25 14:44:22 +01:00
Cedric Nugteren ced830539e Split the documentation and updated where needed 2018-02-24 21:11:28 +01:00
Cedric Nugteren e784df0230 Renamed the API documentation 2018-02-24 20:46:44 +01:00
Cedric Nugteren 45cc809879 Updated the roadmap 2018-02-24 20:46:14 +01:00
Cedric Nugteren 39175836a9 Merge pull request #248 from kpot/patch-1: Fix of multiple duplicates in documentation 2018-02-21 19:30:00 +01:00
Kirill Mavreshko e300ad3292 Fixed duplication of parameter descriptions by the doc generator 2018-02-21 14:18:45 +05:00
Cedric Nugteren 0557694d39 Fixed several issues in the new invert tuner 2018-02-20 20:53:13 +01:00
Kirill Mavreshko 5463bd5c44 Fix of multiple duplicates in documentation 2018-02-20 18:10:31 +05:00
Cedric Nugteren f8c8d167bb Merge pull request #247 from CNugteren/CLBlast-223-python-interface: PyCLBlast: a Python interface for CLBlast based on PyOpenCL 2018-02-18 20:44:00 +01:00
Cedric Nugteren fc10a4baca Set initial pyclblast to be version 1.0.0 2018-02-18 20:19:19 +01:00
Cedric Nugteren c3a3976b7d Updated changelog and roadmap: Python package created 2018-02-18 18:01:26 +01:00
Cedric Nugteren ce5e2a1e00 Prepared PyCLBlast for release as a package on PyPi 2018-02-18 18:01:02 +01:00
Cedric Nugteren 76c21a95c2 Added PyCLBlast samples 2018-02-18 17:59:43 +01:00
Cedric Nugteren a66e24a009 Added all other level 1/2/3 routines to pyclblast 2018-02-18 17:34:10 +01:00
Cedric Nugteren e1bfb40827 Added GEMM to the Python wrapper 2018-02-18 16:33:20 +01:00
Cedric Nugteren eb85f6b514 First agenerated version (clblastXswap only for now) of the pyclblast wrapper 2018-02-14 20:50:47 +01:00
Cedric Nugteren 61b8c771ed Added skeleton for Python interface using Cython 2018-02-13 21:42:32 +01:00
Cedric Nugteren c5a28cd70b Added CLBlast version numbering to the compiled library 2018-02-11 15:31:21 +01:00
Cedric Nugteren 70d0fe89c6 Fixed a minor typo 2018-02-11 15:31:08 +01:00
Cedric Nugteren 101152568a Merge pull request #246 from CNugteren/CLBlast-224-hadamard-product: Hadamard product 2018-02-03 13:18:03 +01:00
Cedric Nugteren 69ed46c8da Implemented the XHAD Hadamard product routine 2018-02-02 21:18:37 +01:00
Cedric Nugteren ae66782eab Fixed the XHAD documentation 2018-02-02 21:12:07 +01:00
Cedric Nugteren ef5008f5e4 Created the API and stubs for the HAD (hadamard-product) routines 2018-01-31 20:41:02 +01:00
Cedric Nugteren 37c5e8f58c Updated to CLBlast version 1.3.0 2018-01-29 20:45:21 +01:00
Cedric Nugteren f12c7fcdf2 Merge branch 'master' of github.com:CNugteren/CLBlast 2018-01-29 20:34:37 +01:00
Cedric Nugteren d1d80ca131 Fixed a compilation error of the kernel-preprocessor test under MSVC 2018-01-29 20:26:25 +01:00
Cedric Nugteren 97e92cb10c Updated the known issues 2018-01-28 14:50:03 +01:00
Cedric Nugteren 180532ea39 Some fixes to the benchmark scripts 2018-01-27 20:06:13 +01:00
Cedric Nugteren ada762f668 Minor displaying improvements to the graph plotting scripts 2018-01-26 20:38:11 +01:00
Cedric Nugteren caebe8a9d5 Fixed an event synchronisation issue in the batched gemm routines 2018-01-26 20:37:04 +01:00
Cedric Nugteren 3651b51664 Improved the benchmark scripts; added gemmstridedbatched benchmark 2018-01-25 21:24:18 +01:00
Cedric Nugteren 19fd263fb2 Moved some constants from global scope to a function; removed unnecessary includes 2018-01-25 20:00:43 +01:00
Cedric Nugteren 6a9d6b5da2 Changed the default number of runs for the GEMV tuner to fix issues for FP16 2018-01-25 19:57:36 +01:00
Cedric Nugteren b2c946c517 Merge pull request #244 from CNugteren/kernel_selection_batched_gemm: Kernel selection for batched GEMM 2018-01-20 10:19:28 +01:00
Cedric Nugteren c3f9371d16 Made GEMM routine tuning a bit more generic in preparation of possible separate batched tuning arguments 2018-01-18 19:41:59 +01:00
Cedric Nugteren bc54411d19 Made the batched routines also choose direct/indirect kernel like the main GEMM routine 2018-01-18 19:41:02 +01:00