Cedric Nugteren
221121b840
Add GitHub Actions CI (#464)
This replaces the old Travis CI builds with GitHub Actions that test on both Ubuntu and macOS, with both Clang and GCC. The builds on macOS also run the tests and some other programs; on Ubuntu, OpenCL is not working at the moment. Because these tests use new or different compilers, I fixed a few warnings and errors along the way.
2023-05-14 11:25:15 +02:00
Cedric Nugteren
3d0c227fa5
AMAX/AMIN integer testing and bug fixes (#457)
* Fixed a bug in XAMAX/XMIN routines that caused the increment and offset to be included in the result
* Perform proper integer-output testing in XAMAX tests
* A few changes towards getting it ready for a PR
* Also fix compilation for clBLAS and cuBLAS references
* Fix a bug that would only use the real part of complex numbers in the amax/amin routines
* A few small fixes related to the AMAX tests
2023-05-07 20:02:52 +02:00
JishinMaster
aec45ea637
set the correct flop count for xgemm
2021-03-13 21:48:04 +01:00
Koichi Akabe
d9db543d75
Fix half-float+kernel_mode test cases of im2col, col2im, and convgemm
2018-12-17 21:57:35 +09:00
Koichi Akabe
032e3b0cc0
Add kernel_mode option to im2col, col2im, and convgemm functions
2018-11-12 10:12:07 +09:00
Cedric Nugteren
6f67525ea6
Changed col2im to append to the existing im-buffer
2018-11-07 19:45:07 +01:00
Cedric Nugteren
469c346a8e
Fixed half-precision tests for im2col and col2im
2018-11-01 21:44:21 +01:00
Koichi Akabe
0b3d04f709
Fix col2im implementation
2018-10-30 14:54:55 +09:00
Cedric Nugteren
d45911b61d
Added groundwork for col2im algorithm plus first non-working version of kernel and test
2018-10-23 20:52:25 +02:00
Cedric Nugteren
44b630fc22
Some name changes in im2col code
2018-10-22 22:12:58 +02:00
Cedric Nugteren
83ba3d4b7b
Merge branch 'master' into convgemm_multi_kernel
2018-09-16 20:01:18 +02:00
Cedric Nugteren
bbb4523b7c
Added reference implementation for xCONVGEMM for half-precision
2018-09-07 22:04:08 +02:00
Cedric Nugteren
391e5757bd
Fixed the tests of OMATCOPY to include proper complex conjugation
2018-07-31 21:44:28 +02:00
Cedric Nugteren
cbcd4ff7e8
Merge branch 'master' into CLBlast-267-convgemm
2018-05-19 17:54:27 +02:00
Cedric Nugteren
8290ad78b9
Fixed a few issues with canary region testing
2018-05-17 12:16:32 +02:00
Cedric Nugteren
b608280361
Fixed the performance client for convgemm and added GFLOPS measurements
2018-05-09 19:59:31 +02:00
Cedric Nugteren
2d1f6ba7fe
Added convgemm skeleton, test infrastructure, and first reference implementation
2018-05-06 11:35:34 +02:00
Cedric Nugteren
69ed46c8da
Implemented the XHAD Hadamard product routine
2018-02-02 21:18:37 +01:00
Cedric Nugteren
ef5008f5e4
Created the API and stubs for the HAD (hadamard-product) routines
2018-01-31 20:41:02 +01:00
Cedric Nugteren
9fb2c61b25
Added API and tests for new GemmStridedBatched routine
2018-01-07 14:27:15 +01:00
Cedric Nugteren
00687f8d81
Prevented half-precision batched routines from failing in the tests
2018-01-06 19:26:38 +01:00
Cedric Nugteren
ce069545d4
Added CUDA interface to get temporary-buffer size for GEMM routine
2018-01-06 10:05:28 +01:00
Cedric Nugteren
5315b982a9
Added the temp-buffer to the GEMM testers and clients
2018-01-03 20:20:31 +01:00
Cedric Nugteren
eb89371d2b
Added a queue argument to the get-size function when running the tests/clients
2018-01-03 20:19:45 +01:00
Cedric Nugteren
ef71d8e9b5
Fixed unused variable warnings showing up with Clang
2017-12-23 16:07:26 +01:00
Cedric Nugteren
5467c0cac5
Fixed a variety of warnings and an error for MSVC2013 compilation
2017-11-19 21:09:24 +01:00
Cedric Nugteren
d24138808b
Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer-returning functions for FP16
2017-11-08 21:20:07 +01:00
Cedric Nugteren
9b0a435fb0
Integrated the GEMM routine tuner for kernel selection; added first tuning results
2017-11-02 21:47:14 +01:00
Cedric Nugteren
e388f055f7
Fixed small bug in (unused) invert tester
2017-10-25 20:35:39 +02:00
Cedric Nugteren
8431a165d0
Fixed a small copy-paste typo
2017-10-15 19:38:48 +02:00
Cedric Nugteren
e6da575fff
Modified test interfaces such that they support either OpenCL or CUDA
2017-10-15 19:35:21 +02:00
Cedric Nugteren
7663cba234
Fixes for the CUDA API: first tests pass and the client runs
2017-10-15 17:43:20 +02:00
Cedric Nugteren
a3069a97c3
Prepared test and client infrastructure for use with the CUDA API
2017-10-15 13:56:19 +02:00
Cedric Nugteren
74fd6767b9
GEMM tests now test both the indirect and the direct kernels separately
2017-10-01 20:36:56 +02:00
Cedric Nugteren
6194d43efb
Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d
2017-08-31 20:34:10 +02:00
Cedric Nugteren
a8c26594d9
Made the im2col client properly handle the arguments
2017-08-23 19:54:09 +02:00
Cedric Nugteren
132e62892d
Implemented proper im2col reference function and completed tests
2017-08-19 16:55:09 +02:00
Cedric Nugteren
777681dcbd
Merge branch 'master' into im_to_col
2017-08-12 20:50:00 +02:00
Cedric Nugteren
844e68853e
Moved some utility functions to a test-specific utility compilation-unit
2017-08-12 15:38:17 +02:00
Cedric Nugteren
97bcf77d4b
First step towards supporting im2col in the test infrastructure
2017-07-16 22:33:49 +02:00
Cedric Nugteren
f77b48692b
Relaxed requirement on a_ld and b_ld for batched GEMM
2017-07-12 21:53:39 +02:00
Cedric Nugteren
ce528a9d39
Fixed and suppressed several warnings for MSVC
2017-06-26 21:38:04 +02:00
Cedric Nugteren
93c8db7fe7
Bug-fix in the half-precision test of the amax routine
2017-05-11 22:19:15 -07:00
Cedric Nugteren
049d0fc95a
Fixed a compiler warning message
2017-04-23 10:45:08 +02:00
Cedric Nugteren
f7f8ec644f
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works
2017-04-13 21:31:27 +02:00
Cedric Nugteren
f24c142948
Made compilation of the cuBLAS wrapper work properly
2017-04-11 21:50:18 +02:00
Cedric Nugteren
6b625f8915
Added reference implementations for performance-testing against cuBLAS
2017-04-10 22:54:14 +02:00
Cedric Nugteren
af9a521042
Fixes the CUDA wrapper (now actually tested on a system with CUDA)
2017-04-03 21:46:07 +02:00
Cedric Nugteren
c5461d77e5
Factored out inclusion of clBLAS and CBLAS from the test-routine files
2017-04-02 15:24:21 +02:00
Cedric Nugteren
a9c25e9fd2
Factored out inclusion of clBLAS and CBLAS from the test-routine files
2017-04-02 15:21:19 +02:00