Koichi Akabe
|
d9db543d75
|
Fix half-float+kernel_mode test cases of im2col, col2im, and convgemm
|
2018-12-17 21:57:35 +09:00 |
|
Koichi Akabe
|
032e3b0cc0
|
Add kernel_mode option to im2col, col2im, and convgemm functions
|
2018-11-12 10:12:07 +09:00 |
|
Cedric Nugteren
|
6f67525ea6
|
Changed col2im to append to the existing im-buffer
|
2018-11-07 19:45:07 +01:00 |
|
Cedric Nugteren
|
469c346a8e
|
Fixed half-precision tests for im2col and col2im
|
2018-11-01 21:44:21 +01:00 |
|
Koichi Akabe
|
0b3d04f709
|
Fix col2im implementation
|
2018-10-30 14:54:55 +09:00 |
|
Cedric Nugteren
|
d45911b61d
|
Added groundwork for col2im algorithm plus first non-working version of kernel and test
|
2018-10-23 20:52:25 +02:00 |
|
Cedric Nugteren
|
44b630fc22
|
Some name changes in im2col code
|
2018-10-22 22:12:58 +02:00 |
|
Cedric Nugteren
|
83ba3d4b7b
|
Merge branch 'master' into convgemm_multi_kernel
|
2018-09-16 20:01:18 +02:00 |
|
Cedric Nugteren
|
bbb4523b7c
|
Added reference implementation for xCONVGEMM for half-precision
|
2018-09-07 22:04:08 +02:00 |
|
Cedric Nugteren
|
391e5757bd
|
Fixed the tests of OMATCOPY to include proper complex conjugation
|
2018-07-31 21:44:28 +02:00 |
|
Cedric Nugteren
|
cbcd4ff7e8
|
Merge branch 'master' into CLBlast-267-convgemm
|
2018-05-19 17:54:27 +02:00 |
|
Cedric Nugteren
|
8290ad78b9
|
Fixed a few issues with canary region testing
|
2018-05-17 12:16:32 +02:00 |
|
Cedric Nugteren
|
b608280361
|
Fixed the performance client for convgemm and added GFLOPS measurements
|
2018-05-09 19:59:31 +02:00 |
|
Cedric Nugteren
|
2d1f6ba7fe
|
Added convgemm skeleton, test infrastructure, and first reference implementation
|
2018-05-06 11:35:34 +02:00 |
|
Cedric Nugteren
|
69ed46c8da
|
Implemented the XHAD Hadamard product routine
|
2018-02-02 21:18:37 +01:00 |
|
Cedric Nugteren
|
ef5008f5e4
|
Created the API and stubs for the HAD (hadamard-product) routines
|
2018-01-31 20:41:02 +01:00 |
|
Cedric Nugteren
|
9fb2c61b25
|
Added API and tests for new GemmStridedBatched routine
|
2018-01-07 14:27:15 +01:00 |
|
Cedric Nugteren
|
00687f8d81
|
Prevented half-precision batched routines from failing in the tests
|
2018-01-06 19:26:38 +01:00 |
|
Cedric Nugteren
|
ce069545d4
|
Added CUDA interface to get temporary-buffer size for GEMM routine
|
2018-01-06 10:05:28 +01:00 |
|
Cedric Nugteren
|
5315b982a9
|
Added the temp-buffer to the GEMM testers and clients
|
2018-01-03 20:20:31 +01:00 |
|
Cedric Nugteren
|
eb89371d2b
|
Added a queue argument to the get-size function when running the tests/clients
|
2018-01-03 20:19:45 +01:00 |
|
Cedric Nugteren
|
ef71d8e9b5
|
Fixed unused variable warnings showing up with Clang
|
2017-12-23 16:07:26 +01:00 |
|
Cedric Nugteren
|
5467c0cac5
|
Fixed a variety of warnings and an error for MSVC2013 compilation
|
2017-11-19 21:09:24 +01:00 |
|
Cedric Nugteren
|
d24138808b
|
Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer returning functions for FP16
|
2017-11-08 21:20:07 +01:00 |
|
Cedric Nugteren
|
9b0a435fb0
|
Integrated the GEMM routine tuner for kernel selection; added first tuning results
|
2017-11-02 21:47:14 +01:00 |
|
Cedric Nugteren
|
e388f055f7
|
Fixed small bug in (unused) invert tester
|
2017-10-25 20:35:39 +02:00 |
|
Cedric Nugteren
|
8431a165d0
|
Fixed a small copy-paste typo
|
2017-10-15 19:38:48 +02:00 |
|
Cedric Nugteren
|
e6da575fff
|
Modified test interfaces such that they support either OpenCL or CUDA
|
2017-10-15 19:35:21 +02:00 |
|
Cedric Nugteren
|
7663cba234
|
Fixes for the CUDA API: first tests pass and the client runs
|
2017-10-15 17:43:20 +02:00 |
|
Cedric Nugteren
|
a3069a97c3
|
Prepared test and client infrastructure for use with the CUDA API
|
2017-10-15 13:56:19 +02:00 |
|
Cedric Nugteren
|
74fd6767b9
|
GEMM tests now test both the indirect and the direct kernels separately
|
2017-10-01 20:36:56 +02:00 |
|
Cedric Nugteren
|
6194d43efb
|
Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d
|
2017-08-31 20:34:10 +02:00 |
|
Cedric Nugteren
|
a8c26594d9
|
Made the im2col client properly handle the arguments
|
2017-08-23 19:54:09 +02:00 |
|
Cedric Nugteren
|
132e62892d
|
Implemented proper im2col reference function and completed tests
|
2017-08-19 16:55:09 +02:00 |
|
Cedric Nugteren
|
777681dcbd
|
Merge branch 'master' into im_to_col
|
2017-08-12 20:50:00 +02:00 |
|
Cedric Nugteren
|
844e68853e
|
Moved some utility functions to a test-specific utility compilation-unit
|
2017-08-12 15:38:17 +02:00 |
|
Cedric Nugteren
|
97bcf77d4b
|
First step towards supporting im2col in the test infrastructure
|
2017-07-16 22:33:49 +02:00 |
|
Cedric Nugteren
|
f77b48692b
|
Relaxed requirement on a_ld and b_ld for batched GEMM
|
2017-07-12 21:53:39 +02:00 |
|
Cedric Nugteren
|
ce528a9d39
|
Fixed and suppressed several warnings for MSVC
|
2017-06-26 21:38:04 +02:00 |
|
Cedric Nugteren
|
93c8db7fe7
|
Bug-fix in the half-precision test of the amax routine
|
2017-05-11 22:19:15 -07:00 |
|
Cedric Nugteren
|
049d0fc95a
|
Fixed a compiler warning message
|
2017-04-23 10:45:08 +02:00 |
|
Cedric Nugteren
|
f7f8ec644f
|
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works
|
2017-04-13 21:31:27 +02:00 |
|
Cedric Nugteren
|
f24c142948
|
Made compilation of the cuBLAS wrapper work properly
|
2017-04-11 21:50:18 +02:00 |
|
Cedric Nugteren
|
6b625f8915
|
Added reference implementations for performance-testing against cuBLAS
|
2017-04-10 22:54:14 +02:00 |
|
Cedric Nugteren
|
af9a521042
|
Fixes the CUDA wrapper (now actually tested on a system with CUDA)
|
2017-04-03 21:46:07 +02:00 |
|
Cedric Nugteren
|
c5461d77e5
|
Factored out inclusion of clBLAS and CBLAS from the test-routine files
|
2017-04-02 15:24:21 +02:00 |
|
Cedric Nugteren
|
a9c25e9fd2
|
Factored out inclusion of clBLAS and CBLAS from the test-routine files
|
2017-04-02 15:21:19 +02:00 |
|
Cedric Nugteren
|
b84d2296b8
|
Separated host-device and device-host memory copies from execution of the CBLAS reference code; for fair timing and code de-duplication
|
2017-04-01 13:36:24 +02:00 |
|
Cedric Nugteren
|
49e04c7fce
|
Added API and test infrastructure for the batched GEMM routine
|
2017-03-10 21:24:35 +01:00 |
|
Cedric Nugteren
|
3846f44eaf
|
Small fix for a file that isn't currently compiled anymore
|
2017-03-10 20:53:20 +01:00 |
|