Cedric Nugteren | 63996eb68b | Updated pyclblast to 1.1.0 and uploaded to PyPi | 2018-03-30 10:38:36 +02:00
Cedric Nugteren | 4de220a7a2 | Merge pull request #255 from kodonnell/py_override (Adding override parameters to pyclblast) | 2018-03-30 10:28:00 +02:00
Cedric Nugteren | d86ff75fa5 | Added argument checking for the GEMM tuner: expects m/n to be multiples of MWG/NWG | 2018-03-30 10:23:33 +02:00
Cedric Nugteren | 7e69c422af | Updated the roadmap | 2018-03-30 10:05:16 +02:00
Cedric Nugteren | bb0889fa7a | Merge branch 'CLBlast-227-vivante-compiler-errors' | 2018-03-30 09:22:09 +02:00
kodonell | 173a7eb928 | merged | 2018-03-27 08:55:39 +13:00
kodonell | d16f2d1317 | got the generator thing working | 2018-03-27 08:45:54 +13:00
kodonell | f07c2a29b8 | moved override_parameters example out of sgemm example | 2018-03-27 08:30:58 +13:00
kodonell | 58e70c56f1 | tidying up pyclblast override_parameters api, and added example | 2018-03-26 08:51:55 +13:00
Cedric Nugteren | 1cbe2ea301 | Removed arrays as function argument from GEMM kernels for Vivante OpenCL compiler | 2018-03-23 20:29:20 +01:00
Cedric Nugteren | a97d8a0197 | Merge pull request #269 from CNugteren/CLBlast-266-local-mem-constraint (CLBlast #266 local mem constraint) | 2018-03-22 22:42:33 +01:00
Cedric Nugteren | 9fb6550dd0 | Added the OpenCL local memory size constraint to the tuners | 2018-03-22 21:01:02 +01:00
Cedric Nugteren | 7a2371213b | Re-added support for local memory size constraint checking in the tuner | 2018-03-21 22:58:37 +01:00
Cedric Nugteren | 52791bf355 | Fixed a failing TRSM test using a CPU with Apple OpenCL | 2018-03-15 21:09:52 +01:00
Cedric Nugteren | 7a756cbce7 | Fixed a failing TRSV test using a CPU with Apple OpenCL | 2018-03-15 20:58:42 +01:00
Cedric Nugteren | f4d96e80c3 | Fixed breaking preprocessor test on certain platforms due to empty kernel string | 2018-03-15 20:45:41 +01:00
Cedric Nugteren | 9ff6cd7547 | Added queue-finish commands to PyCLBlast samples and tests | 2018-03-15 20:37:48 +01:00
Cedric Nugteren | 934893972e | Merge pull request #262 from CNugteren/CLBlast-237-tuning-api (CLBlast #237: Tuning API) | 2018-03-11 15:38:33 +01:00
Cedric Nugteren | bcf1208431 | Added basic tests for PyCLBlast | 2018-03-11 15:32:36 +01:00
Cedric Nugteren | 0dd1bc6f48 | Made benchmarking script also work for complex numbers | 2018-03-10 17:03:57 +01:00
Cedric Nugteren | 49b02ec194 | Added initial glossary | 2018-03-10 17:02:38 +01:00
Cedric Nugteren | 86455841d1 | Added badge for OSX-Intel-CPU builds | 2018-03-10 16:49:36 +01:00
Cedric Nugteren | 903deaf368 | Fixed an issue for DLL linking under Windows | 2018-03-10 16:45:31 +01:00
Cedric Nugteren | e7dccfa3cc | Fixed an issue for DLL linking under Windows | 2018-03-10 14:57:36 +01:00
Cedric Nugteren | 54bbc99273 | Updated the documentation for the tuner API | 2018-03-10 14:52:40 +01:00
Cedric Nugteren | 3d2ef9331b | Fixed a few things for the new tuning API | 2018-03-10 14:35:11 +01:00
Cedric Nugteren | 0bdc51e47c | Completed the API for all tuneable kernels | 2018-03-10 10:54:44 +01:00
kodonell | c6056da0c8 | ok, device id working | 2018-03-10 22:21:30 +13:00
Cedric Nugteren | 6397e61746 | Added several more tuner API functions | 2018-03-09 21:40:22 +01:00
kodonell | 54a4b871b3 | initial add of override parameters to pyclblast - cython not complaining, but segfault | 2018-03-09 15:27:33 +13:00
Cedric Nugteren | 49cc8b31ff | Fixed compilation issue in Xger tuner | 2018-03-06 20:59:23 +01:00
Cedric Nugteren | 0e1a152023 | First version of the tuning API, added interface for copy-kernel, added sample | 2018-03-06 20:52:12 +01:00
Cedric Nugteren | a1cedf36e3 | Separate kernel tuners in .cpp with main and .hpp with settings | 2018-03-03 16:37:31 +01:00
Cedric Nugteren | 269bddbf34 | Fixed the buildbot badges in the README | 2018-03-03 13:12:09 +01:00
Cedric Nugteren | 1aef354577 | Updated documentation and build badges | 2018-03-03 10:57:06 +01:00
Cedric Nugteren | bff64917bd | Fixed some small issues regarding PR#253 | 2018-03-03 10:43:12 +01:00
Cedric Nugteren | e3384be0d0 | Merge pull request #253 from sivagnanamn/master (Added C API for getting GEMM temp buffer size) | 2018-03-03 10:29:47 +01:00
sivagnanamn | 1433dc67f1 | Added C API for getting GEMM temp buffer size | 2018-03-03 03:00:17 +09:00
Cedric Nugteren | 1940e67009 | Updated the changelog | 2018-02-26 19:53:50 +01:00
Cedric Nugteren | be8d95bd03 | Added a note on preventing segfaults with OpenCL using the AMD APP SDK | 2018-02-26 19:52:53 +01:00
Cedric Nugteren | d6fa188abf | Merge pull request #249 from CNugteren/documation_reorg (Documation update) | 2018-02-25 18:21:47 +01:00
Cedric Nugteren | c10bbff751 | Fixed Ubuntu PPA package name | 2018-02-25 16:15:20 +01:00
Cedric Nugteren | 11f765c16c | Generated function signatures/inspect for PyCLBlast | 2018-02-25 15:31:38 +01:00
Cedric Nugteren | 13dc26e63d | Generated PyCLBlast docstrings | 2018-02-25 15:30:57 +01:00
Cedric Nugteren | 6710c60935 | Some style improvements in the pyclblast code generator | 2018-02-25 14:51:58 +01:00
Cedric Nugteren | 9699169cdf | Added API documentation for two missing C++ functions | 2018-02-25 14:44:22 +01:00
Cedric Nugteren | ced830539e | Split the documentation and updated where needed | 2018-02-24 21:11:28 +01:00
Cedric Nugteren | e784df0230 | Renamed the API documentation | 2018-02-24 20:46:44 +01:00
Cedric Nugteren | 45cc809879 | Updated the roadmap | 2018-02-24 20:46:14 +01:00
Cedric Nugteren | 39175836a9 | Merge pull request #248 from kpot/patch-1 (Fix of multiple duplicates in documentation) | 2018-02-21 19:30:00 +01:00