Commit graph

273 commits

Author SHA1 Message Date
Cedric Nugteren bf7aeb8d5b Improved the pre-processor's handling of defines; added a special nested defines test 2017-11-30 21:43:16 +01:00
Cedric Nugteren 13eb772343 Integrated pre-processor in compilation flow, default is still disabled 2017-11-30 21:32:47 +01:00
Cedric Nugteren 0dde6af703 Extended the preprocessor tests to include CopyFast and CopyPad 2017-11-29 20:18:36 +01:00
Cedric Nugteren 426406668e Improved the pre-processor tester, added GEMV and GER kernels 2017-11-28 20:52:47 +01:00
Cedric Nugteren f01bcded1e Moved string splitting functions; added string character removal function 2017-11-25 17:44:21 +01:00
Cedric Nugteren c0c6d00b12 Added stub for a preprocessor and a corresponding compilation test 2017-11-25 10:24:05 +01:00
Cedric Nugteren d7b29d864a Fixed a Clang compilation error 2017-11-24 21:41:45 +01:00
Cedric Nugteren a5cef9ef3b Added missing include file 2017-11-24 21:11:52 +01:00
Cedric Nugteren a768b7686b Added precision check to parameter override for the clients 2017-11-24 21:09:39 +01:00
Cedric Nugteren 9527c89c30 Made parameter override in the clients a command-line argument and added support for multi-kernel routines 2017-11-22 20:53:20 +01:00
Cedric Nugteren 8c9ecd9736 Implemented first version of reading JSON files from disk in the client to override parameters 2017-11-21 22:05:08 +01:00
Cedric Nugteren 5467c0cac5 Fixed a variety of warnings and an error for MSVC2013 compilation 2017-11-19 21:09:24 +01:00
Cedric Nugteren 4bac1287f2 Moved square-difference utility function for use in the tuners 2017-11-13 21:10:44 +01:00
Cedric Nugteren d24138808b Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer returning functions for FP16 2017-11-08 21:20:07 +01:00
Cedric Nugteren b18cc9d3f1
Merge pull request #212 from CNugteren/kernel_selection_tuner
GEMM kernel selection tuner
2017-11-07 22:20:13 +01:00
Cedric Nugteren 9b0a435fb0 Integrated the GEMM routine tuner for kernel selection; added first tuning results 2017-11-02 21:47:14 +01:00
Cedric Nugteren 12b08ae491 Merge branch 'master' into android_support 2017-10-28 17:32:37 +02:00
Cedric Nugteren bd57dfa435 Moved timing function to a separate file 2017-10-28 14:12:05 +02:00
Cedric Nugteren e388f055f7 Fixed small bug in (unused) invert tester 2017-10-25 20:35:39 +02:00
Cedric Nugteren 9d879c949a Fix an incompatibility with CUDA's FP16 definition 2017-10-17 20:29:23 +02:00
Cedric Nugteren 8431a165d0 Fixed a small copy-paste typo 2017-10-15 19:38:48 +02:00
Cedric Nugteren e6da575fff Modified test interfaces such that they support either OpenCL or CUDA 2017-10-15 19:35:21 +02:00
Cedric Nugteren 7663cba234 Fixes for the CUDA API: first tests pass and the client runs 2017-10-15 17:43:20 +02:00
Cedric Nugteren a3069a97c3 Prepared test and client infrastructure for use with the CUDA API 2017-10-15 13:56:19 +02:00
Cedric Nugteren 9224da19ef Fixed the Python generator script w.r.t. the recent change of testing direct/in-direct GEMM kernels separately 2017-10-09 20:06:25 +02:00
Cedric Nugteren 3598762029 Moved the remaining OpenCL specific host code to the clpp11.h header where it belongs 2017-10-08 10:29:47 +02:00
Cedric Nugteren 6d3e1212f0 Synchronizes clpp11.h with CLCudaAPI 9.0 2017-10-07 18:43:29 +02:00
Cedric Nugteren 74fd6767b9 GEMM tests now test both the in-direct and the direct kernels seperately 2017-10-01 20:36:56 +02:00
Cedric Nugteren 21af690472 Added missing headers 2017-09-26 21:17:55 +02:00
Cedric Nugteren ed980a1df1 Updated database override function to work with the new database storage format 2017-09-24 15:44:14 +02:00
Cedric Nugteren 2df9f21ab8 Added extra benchmarks to verify new database caching keys performance 2017-09-23 18:06:43 +02:00
Cedric Nugteren 890281f3e8 Made database-caching no longer dependent on device name but on device/platform IDs 2017-09-23 17:50:44 +02:00
Cedric Nugteren 65c492edf6 Added OpenCL properties printing to the diagnostics helper 2017-09-22 21:35:32 +02:00
Cedric Nugteren 2ef6578961 Added first version of a small CLBlast diagnostics helper 2017-09-19 21:43:35 +02:00
Cedric Nugteren 6194d43efb Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d 2017-08-31 20:34:10 +02:00
Cedric Nugteren a8c26594d9 Made the im2col client properly handle the arguments 2017-08-23 19:54:09 +02:00
Cedric Nugteren 132e62892d Implemented proper im2col reference function and completd tests 2017-08-19 16:55:09 +02:00
Cedric Nugteren 777681dcbd Merge branch 'master' into im_to_col 2017-08-12 20:50:00 +02:00
Cedric Nugteren 844e68853e Moved some utility functions to a test-specific utility compilation-unit 2017-08-12 15:38:17 +02:00
Cedric Nugteren 97bcf77d4b First step towards supporting im2col in the test infrastructure 2017-07-16 22:33:49 +02:00
Cedric Nugteren de9ed9d4ea Fixed batched tests when testing for invalid sizes against clBLAS 2017-07-12 21:54:16 +02:00
Cedric Nugteren f77b48692b Relaxed requirement on a_ld and b_ld for batched GEMM 2017-07-12 21:53:39 +02:00
Cedric Nugteren d4c8a7c8b0 Changed printf-statements with %zu into std::cout to fix MSVC 2013 compatibility 2017-07-09 20:19:08 +02:00
Cedric Nugteren 4b415bdf3c Disabled UNIX-style terminal color printing under Windows 2017-07-09 20:04:13 +02:00
Cedric Nugteren 4e51b1e1f8 Moved and inlined some static member variables and disabled spurious clang warnings 2017-06-27 21:05:16 +02:00
Cedric Nugteren e60b10529a Undo of earlier move of TestBlas::kTransposes constant to fix MSVC 2013 compilation 2017-06-27 20:59:28 +02:00
Cedric Nugteren ce528a9d39 Fixed and suppresses several warnings for MSVC 2017-06-26 21:38:04 +02:00
Cedric Nugteren 19504ed609 Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings 2017-06-25 20:59:22 +02:00
Cedric Nugteren 1a8ed48a35 Fixed some Clang and MSVC warnings 2017-06-25 11:50:36 +02:00
Cedric Nugteren 93c8db7fe7 Bug-fix in the half-precision test of the amax routine 2017-05-11 22:19:15 -07:00
Cedric Nugteren 049d0fc95a Fixed a compiler warning message 2017-04-23 10:45:08 +02:00
Cedric Nugteren 409a5a2ad0 Fixed a namespace clash with CUDA FP16 for the half-datatype 2017-04-17 16:47:15 +02:00
Cedric Nugteren 2673f50518 Merge branch 'development' into benchmarking 2017-04-16 19:41:14 +02:00
Cedric Nugteren e3bb58f602 Finalized support for performance testing against cuBLAS 2017-04-16 17:53:51 +02:00
Cedric Nugteren 10205d773e Added a new Xaxpy kernel in between the regular and fast version in 2017-04-14 20:16:10 +02:00
Cedric Nugteren f7f8ec644f Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works 2017-04-13 21:31:27 +02:00
Cedric Nugteren f24c142948 Made compilation of the cuBLAS wrapper work properly 2017-04-11 21:50:18 +02:00
Cedric Nugteren 6b625f8915 Added reference implementations for performance-testing against cuBLAS 2017-04-10 22:54:14 +02:00
Cedric Nugteren 52dd7433ca Completed the cuBLAS wrapper 2017-04-06 20:56:28 +02:00
Cedric Nugteren dbe22b5bf3 Fixed some size_t to int conversion warnings for the CBLAS interface 2017-04-06 19:40:51 +02:00
Cedric Nugteren 674ff96fdf Added a first version of a cuBLAS wrapper (WIP) 2017-04-05 21:27:25 +02:00
Cedric Nugteren af9a521042 Fixes the CUDA wrapper (now actually tested on a system with CUDA) 2017-04-03 21:46:07 +02:00
Cedric Nugteren eb1fda2729 In-lined the float2 and double2 types to avoid collision with CUDA's definitions 2017-04-03 21:44:35 +02:00
Cedric Nugteren b24d364743 Layed the groundwork for cuBLAS comparisons in the clients 2017-04-02 18:06:15 +02:00
Cedric Nugteren c5461d77e5 Factored out inclusion of clBLAS and CBLAS from the test-routine files 2017-04-02 15:24:21 +02:00
Cedric Nugteren a9c25e9fd2 Factored out inclusion of clBLAS and CBLAS from the test-routine files 2017-04-02 15:21:19 +02:00
Cedric Nugteren b84d2296b8 Separated host-device and device-host memory copies from execution of the CBLAS reference code; for fair timing and code de-duplication 2017-04-01 13:36:24 +02:00
Cedric Nugteren a98c00a267 Fixed a GCC/MSVC compilation issue 2017-03-20 19:53:55 +01:00
Cedric Nugteren 0610447a7a Fixed a compilation issue for GCC/MSVC 2017-03-19 17:37:52 +01:00
Cedric Nugteren 068ff32e9f Fixed a linker issue for Clang 2017-03-12 10:41:18 +01:00
Cedric Nugteren 49e04c7fce Added API and test infrastructure for the batched GEMM routine 2017-03-10 21:24:35 +01:00
Cedric Nugteren 3846f44eaf Small fix for a file that isn't currently compiled anymore 2017-03-10 20:53:20 +01:00
Cedric Nugteren d754586b49 Added proper testing of the alpha parameter; finalized the batched AXPY implementation 2017-03-10 20:49:59 +01:00
Cedric Nugteren fa0a9c689f Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes 2017-03-08 20:10:20 +01:00
Cedric Nugteren 6aba0bbae7 Minor fixes to the client w.r.t. the addition of the batch count 2017-03-05 16:44:16 +01:00
Cedric Nugteren b114ea49a9 Added first naive version of the batched AXPY routine 2017-03-05 15:06:14 +01:00
Cedric Nugteren cdf354f895 Adjusted the test-infrastructure to support testing of batched-versions of routines 2017-03-05 15:04:16 +01:00
Cedric Nugteren 7f14b11f1e Changed the way the test-data is generated: now using a single MT generator and distribution for all data 2017-03-05 11:13:47 +01:00
Cedric Nugteren f9a520b3af Prepared generator for batched routines; added batched AXPY routine interface 2017-03-05 10:38:38 +01:00
Cedric Nugteren 37228c9098 Fixed a missing include for the tests 2017-03-04 20:45:39 +01:00
Cedric Nugteren e993ee077b Added a proper data-preparation function for the TRSM tests 2017-03-04 15:21:33 +01:00
Cedric Nugteren e8d5923d27 Made a double to float cast explicit for MSVC compatibility (C2397) 2017-03-01 20:42:06 +01:00
Cedric Nugteren d6f1b5fca3 Added L2 error computation and checking for half-precision tests 2017-02-27 21:49:20 +01:00
Cedric Nugteren ea6790665d Merge branch 'development' into triangular_solvers 2017-02-26 14:51:45 +01:00
Cedric Nugteren a145890aaa Added a guard against invalid buffer sizes in the prepare-data functions for tests 2017-02-26 14:37:29 +01:00
Cedric Nugteren b7310036ed Removed half-precision support from the TRSM routine; too unstable 2017-02-26 12:56:21 +01:00
Cedric Nugteren 70d8c4bad7 Improved the correctness tests for complex numbers in case either real or imag is much larger than the other 2017-02-26 10:19:53 +01:00
Cedric Nugteren e47d95887c Added PrepareData function for TRSM to create proper test input 2017-02-25 12:23:04 +01:00
Cedric Nugteren 133ebfc834 Added data-preparation function for the TRSV tests and special nan/inf checks in the error checking to make the tests pass 2017-02-19 17:43:26 +01:00
Cedric Nugteren 7b2170818f Changed the override-parameters test such that it is compatible with more devices 2017-02-18 11:22:07 +01:00
Cedric Nugteren bdc57221bd Added simple tests for the OverrideParameters function 2017-02-14 21:09:00 +01:00
Cedric Nugteren c248f900c0 Merge branch 'development' into triangular_solvers 2017-02-05 22:18:59 +01:00
Cedric Nugteren e7cbb5915a Fixed complex version of the TRSV kernel 2017-02-05 14:36:31 +01:00
Ivan Shapovalov 064ba4abd4 treewide: silence type mismatch warnings in *printf() 2017-01-24 02:55:09 +03:00
Ivan Shapovalov 519ccbd273 Tester: always fail on OpenCL and CLBlast internal errors
These errors are self-evident and enough to fail the test even if there is
no clBLAS reference to compare error codes with.
2017-01-24 02:55:09 +03:00
Ivan Shapovalov 1a1e863ab3 treewide: include clpp11.hpp first to silence deprecation warnings
Otherwise, cl.h gets included through clblast.h before clpp11.hpp.
2017-01-20 17:32:42 +03:00
Cedric Nugteren a5fd2323b6 Added prototype for the TRSV routine 2017-01-20 11:30:32 +01:00
Cedric Nugteren df9a77d74d Added first version of the TRSM routine based on the diagonal invert kernel 2017-01-18 21:29:59 +01:00
Cedric Nugteren 4b3ffd9989 Added a first version of the diagonal block invert routine in preparation of TRSM 2017-01-15 17:30:00 +01:00
Cedric Nugteren 4a4be0c3a5 Prints additional information in verbose/debug mode 2017-01-15 17:17:40 +01:00