Commit graph

82 commits

Author SHA1 Message Date
Cedric Nugteren 5f97d64505 Update API documentation 2020-03-08 11:29:47 +01:00
Cedric Nugteren 7084311e45 Added tuning parameters for Tesla P100 16GB 2019-02-09 16:31:48 +01:00
Cedric Nugteren 1035e533cd Added tuning parameters for Xeon E5-2630 v3 and v4 2019-02-09 16:29:30 +01:00
Cedric Nugteren 11f4c7dd93 Added documentation on the convgemm routine 2019-01-19 15:44:19 +01:00
Koichi Akabe c0883cf2fe Update the documentation 2018-12-18 14:08:16 +09:00
Cedric Nugteren 4676ec2921 Added a FAQ document 2018-12-01 17:19:28 +01:00
Koichi Akabe 032e3b0cc0 Add kernel_mode option to im2col, col2im, and convgemm functions 2018-11-12 10:12:07 +09:00
Cedric Nugteren 6f67525ea6 Changed col2im to append to the existing im-buffer 2018-11-07 19:45:07 +01:00
Cedric Nugteren 2d32a23293 Added new col2im routine to the documentation 2018-11-01 21:46:19 +01:00
Cedric Nugteren d45911b61d Added groundwork for col2im algorithm plus first non-working version of kernel and test 2018-10-23 20:52:25 +02:00
Cedric Nugteren 634b2bc75c
Merge pull request #319 from CNugteren/convgemm_multi_kernel
First im2col+GEMM implementation of convolution
2018-10-14 17:27:45 +02:00
Cedric Nugteren 8676b62178 Updated the documentation for GEMV tuning 2018-10-13 17:43:51 +02:00
Cedric Nugteren 83ba3d4b7b Merge branch 'master' into convgemm_multi_kernel 2018-09-16 20:01:18 +02:00
Cedric Nugteren 91dbd580ab Added a kernel-parameter pair table to document the tuning API 2018-09-15 18:47:31 +02:00
Cedric Nugteren c788e040f7 Added xCONVGEMM as im2col plus a batched GEMM kernel 2018-09-07 22:02:44 +02:00
Hendrik Ranocha faed209f30
Add Julia Wrapper
I've written a wrapper of CLBlast in Julia which can be found [here](https://github.com/JuliaGPU/CLBlast.jl). It is published and available using the Julia package manager.
2018-09-03 15:57:16 +02:00
Cedric Nugteren 2dd539f911 Removed complex numbers support for CONVGEMM 2018-07-29 10:37:14 +02:00
Cedric Nugteren 5903820ba2 Merge branch 'master' into CLBlast-267-convgemm 2018-07-29 10:26:34 +02:00
Cedric Nugteren db179a1e40 Updated to CLBlast version 1.4.1 2018-07-14 12:29:06 +02:00
Cedric Nugteren f72620f474 Added tuning results for Intel i5-4970S 2018-07-13 21:25:21 +02:00
Cedric Nugteren 08b1417956 Added tuning results for GeForce GTX 1070 Ti 2018-07-13 21:07:32 +02:00
Cedric Nugteren c459582c4f Added tuning results for HD Graphics 6000 Broadwell GT3 2018-07-13 21:05:43 +02:00
Cedric Nugteren 1c9a741470 Merge branch 'master' into CLBlast-267-convgemm 2018-06-03 15:53:27 +02:00
Cedric Nugteren fee8df153c Added list of tuners to be run by 'alltuners' target 2018-06-03 10:42:15 +02:00
Cedric Nugteren cbcd4ff7e8 Merge branch 'master' into CLBlast-267-convgemm 2018-05-19 17:54:27 +02:00
Cedric Nugteren 66583b3cda The GEMM routine tuner now loads kernel JSON tuning results from disk if available; now run part of alltuners target 2018-05-19 12:48:59 +02:00
Cedric Nugteren ad57a45039 Added documentation on some details of the GEMM implementation 2018-05-17 12:50:03 +02:00
Cedric Nugteren a4119531ee Updated the documentation for convgemm to include data layout (NCHW) 2018-05-09 17:46:27 +02:00
Cedric Nugteren 2d1f6ba7fe Added convgemm skeleton, test infrastructure, and first reference implementation 2018-05-06 11:35:34 +02:00
Cedric Nugteren 2776d76176 Added interface of batched convolution as GEMM 2018-05-05 14:06:33 +02:00
Cedric Nugteren 16f7f49683 Added tuning results for NVIDIA GeForce 970 2018-04-07 17:48:25 +02:00
Cedric Nugteren 9596e46d01 Added tuning results for NVIDIA GeForce 920MX 2018-04-07 17:44:32 +02:00
Cedric Nugteren 7e69c422af Updated the roadmap 2018-03-30 10:05:16 +02:00
Cedric Nugteren 934893972e
Merge pull request #262 from CNugteren/CLBlast-237-tuning-api
CLBlast #237: Tuning API
2018-03-11 15:38:33 +01:00
Cedric Nugteren 49b02ec194 Added initial glossary 2018-03-10 17:02:38 +01:00
Cedric Nugteren 54bbc99273 Updated the documentation for the tuner API 2018-03-10 14:52:40 +01:00
Cedric Nugteren 1aef354577 Updated documentation and build badges 2018-03-03 10:57:06 +01:00
Cedric Nugteren bff64917bd Fixed some small issues regarding PR#253 2018-03-03 10:43:12 +01:00
sivagnanamn 1433dc67f1 Added C API for getting GEMM temp buffer size 2018-03-03 03:00:17 +09:00
Cedric Nugteren be8d95bd03 Added a note on preventing segfaults with OpenCL using the AMD APP SDK 2018-02-26 19:52:53 +01:00
Cedric Nugteren c10bbff751 Fixed Ubuntu PPA package name 2018-02-25 16:15:20 +01:00
Cedric Nugteren 9699169cdf Added API documentation for two missing C++ functions 2018-02-25 14:44:22 +01:00
Cedric Nugteren ced830539e Split the documentation and updated where needed 2018-02-24 21:11:28 +01:00
Cedric Nugteren e784df0230 Renamed the API documentation 2018-02-24 20:46:44 +01:00
Kirill Mavreshko 5463bd5c44
Fix of multiple duplicates in documentation 2018-02-20 18:10:31 +05:00
Cedric Nugteren ae66782eab Fixed the XHAD documentation 2018-02-02 21:12:07 +01:00
Cedric Nugteren ef5008f5e4 Created the API and stubs for the HAD (hadamard-product) routines 2018-01-31 20:41:02 +01:00
Cedric Nugteren 9fb2c61b25 Added API and tests for new GemmStridedBatched routine 2018-01-07 14:27:15 +01:00
Cedric Nugteren af14fff1e9 Updated the generator script to automatically generate the temp-buffer code 2018-01-04 19:31:57 +01:00
Cedric Nugteren 84ec50e29d Added interface and stubs for the im2col routine 2017-07-02 12:10:22 +02:00