Cedric Nugteren
|
5467c0cac5
|
Fixed a variety of warnings and an error for MSVC2013 compilation
|
2017-11-19 21:09:24 +01:00 |
|
Cedric Nugteren
|
4bac1287f2
|
Moved square-difference utility function for use in the tuners
|
2017-11-13 21:10:44 +01:00 |
|
Cedric Nugteren
|
d24138808b
|
Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer returning functions for FP16
|
2017-11-08 21:20:07 +01:00 |
|
Cedric Nugteren
|
b18cc9d3f1
|
Merge pull request #212 from CNugteren/kernel_selection_tuner
GEMM kernel selection tuner
|
2017-11-07 22:20:13 +01:00 |
|
Cedric Nugteren
|
9b0a435fb0
|
Integrated the GEMM routine tuner for kernel selection; added first tuning results
|
2017-11-02 21:47:14 +01:00 |
|
Cedric Nugteren
|
12b08ae491
|
Merge branch 'master' into android_support
|
2017-10-28 17:32:37 +02:00 |
|
Cedric Nugteren
|
bd57dfa435
|
Moved timing function to a separate file
|
2017-10-28 14:12:05 +02:00 |
|
Cedric Nugteren
|
e388f055f7
|
Fixed small bug in (unused) invert tester
|
2017-10-25 20:35:39 +02:00 |
|
Cedric Nugteren
|
9d879c949a
|
Fix an incompatibility with CUDA's FP16 definition
|
2017-10-17 20:29:23 +02:00 |
|
Cedric Nugteren
|
8431a165d0
|
Fixed a small copy-paste typo
|
2017-10-15 19:38:48 +02:00 |
|
Cedric Nugteren
|
e6da575fff
|
Modified test interfaces such that they support either OpenCL or CUDA
|
2017-10-15 19:35:21 +02:00 |
|
Cedric Nugteren
|
7663cba234
|
Fixes for the CUDA API: first tests pass and the client runs
|
2017-10-15 17:43:20 +02:00 |
|
Cedric Nugteren
|
a3069a97c3
|
Prepared test and client infrastructure for use with the CUDA API
|
2017-10-15 13:56:19 +02:00 |
|
Cedric Nugteren
|
9224da19ef
|
Fixed the Python generator script w.r.t. the recent change of testing direct/in-direct GEMM kernels separately
|
2017-10-09 20:06:25 +02:00 |
|
Cedric Nugteren
|
3598762029
|
Moved the remaining OpenCL specific host code to the clpp11.h header where it belongs
|
2017-10-08 10:29:47 +02:00 |
|
Cedric Nugteren
|
6d3e1212f0
|
Synchronized clpp11.h with CLCudaAPI 9.0
|
2017-10-07 18:43:29 +02:00 |
|
Cedric Nugteren
|
74fd6767b9
|
GEMM tests now test both the in-direct and the direct kernels separately
|
2017-10-01 20:36:56 +02:00 |
|
Cedric Nugteren
|
21af690472
|
Added missing headers
|
2017-09-26 21:17:55 +02:00 |
|
Cedric Nugteren
|
ed980a1df1
|
Updated database override function to work with the new database storage format
|
2017-09-24 15:44:14 +02:00 |
|
Cedric Nugteren
|
2df9f21ab8
|
Added extra benchmarks to verify new database caching keys performance
|
2017-09-23 18:06:43 +02:00 |
|
Cedric Nugteren
|
890281f3e8
|
Made database-caching no longer dependent on device name but on device/platform IDs
|
2017-09-23 17:50:44 +02:00 |
|
Cedric Nugteren
|
65c492edf6
|
Added OpenCL properties printing to the diagnostics helper
|
2017-09-22 21:35:32 +02:00 |
|
Cedric Nugteren
|
2ef6578961
|
Added first version of a small CLBlast diagnostics helper
|
2017-09-19 21:43:35 +02:00 |
|
Cedric Nugteren
|
6194d43efb
|
Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d
|
2017-08-31 20:34:10 +02:00 |
|
Cedric Nugteren
|
a8c26594d9
|
Made the im2col client properly handle the arguments
|
2017-08-23 19:54:09 +02:00 |
|
Cedric Nugteren
|
132e62892d
|
Implemented proper im2col reference function and completed tests
|
2017-08-19 16:55:09 +02:00 |
|
Cedric Nugteren
|
777681dcbd
|
Merge branch 'master' into im_to_col
|
2017-08-12 20:50:00 +02:00 |
|
Cedric Nugteren
|
844e68853e
|
Moved some utility functions to a test-specific utility compilation-unit
|
2017-08-12 15:38:17 +02:00 |
|
Cedric Nugteren
|
97bcf77d4b
|
First step towards supporting im2col in the test infrastructure
|
2017-07-16 22:33:49 +02:00 |
|
Cedric Nugteren
|
de9ed9d4ea
|
Fixed batched tests when testing for invalid sizes against clBLAS
|
2017-07-12 21:54:16 +02:00 |
|
Cedric Nugteren
|
f77b48692b
|
Relaxed requirement on a_ld and b_ld for batched GEMM
|
2017-07-12 21:53:39 +02:00 |
|
Cedric Nugteren
|
d4c8a7c8b0
|
Changed printf-statements with %zu into std::cout to fix MSVC 2013 compatibility
|
2017-07-09 20:19:08 +02:00 |
|
Cedric Nugteren
|
4b415bdf3c
|
Disabled UNIX-style terminal color printing under Windows
|
2017-07-09 20:04:13 +02:00 |
|
Cedric Nugteren
|
4e51b1e1f8
|
Moved and inlined some static member variables and disabled spurious clang warnings
|
2017-06-27 21:05:16 +02:00 |
|
Cedric Nugteren
|
e60b10529a
|
Undo of earlier move of TestBlas::kTransposes constant to fix MSVC 2013 compilation
|
2017-06-27 20:59:28 +02:00 |
|
Cedric Nugteren
|
ce528a9d39
|
Fixed and suppressed several warnings for MSVC
|
2017-06-26 21:38:04 +02:00 |
|
Cedric Nugteren
|
19504ed609
|
Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings
|
2017-06-25 20:59:22 +02:00 |
|
Cedric Nugteren
|
1a8ed48a35
|
Fixed some Clang and MSVC warnings
|
2017-06-25 11:50:36 +02:00 |
|
Cedric Nugteren
|
93c8db7fe7
|
Bug-fix in the half-precision test of the amax routine
|
2017-05-11 22:19:15 -07:00 |
|
Cedric Nugteren
|
049d0fc95a
|
Fixed a compiler warning message
|
2017-04-23 10:45:08 +02:00 |
|
Cedric Nugteren
|
409a5a2ad0
|
Fixed a namespace clash with CUDA FP16 for the half-datatype
|
2017-04-17 16:47:15 +02:00 |
|
Cedric Nugteren
|
2673f50518
|
Merge branch 'development' into benchmarking
|
2017-04-16 19:41:14 +02:00 |
|
Cedric Nugteren
|
e3bb58f602
|
Finalized support for performance testing against cuBLAS
|
2017-04-16 17:53:51 +02:00 |
|
Cedric Nugteren
|
10205d773e
|
Added a new Xaxpy kernel in between the regular and fast version in
|
2017-04-14 20:16:10 +02:00 |
|
Cedric Nugteren
|
f7f8ec644f
|
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works
|
2017-04-13 21:31:27 +02:00 |
|
Cedric Nugteren
|
f24c142948
|
Made compilation of the cuBLAS wrapper work properly
|
2017-04-11 21:50:18 +02:00 |
|
Cedric Nugteren
|
6b625f8915
|
Added reference implementations for performance-testing against cuBLAS
|
2017-04-10 22:54:14 +02:00 |
|
Cedric Nugteren
|
52dd7433ca
|
Completed the cuBLAS wrapper
|
2017-04-06 20:56:28 +02:00 |
|
Cedric Nugteren
|
dbe22b5bf3
|
Fixed some size_t to int conversion warnings for the CBLAS interface
|
2017-04-06 19:40:51 +02:00 |
|
Cedric Nugteren
|
674ff96fdf
|
Added a first version of a cuBLAS wrapper (WIP)
|
2017-04-05 21:27:25 +02:00 |
|