Cedric Nugteren
|
a3069a97c3
|
Prepared test and client infrastructure for use with the CUDA API
|
2017-10-15 13:56:19 +02:00 |
|
Cedric Nugteren
|
9224da19ef
|
Fixed the Python generator script w.r.t. the recent change of testing direct/in-direct GEMM kernels separately
|
2017-10-09 20:06:25 +02:00 |
|
Cedric Nugteren
|
74fd6767b9
|
GEMM tests now test both the in-direct and the direct kernels separately
|
2017-10-01 20:36:56 +02:00 |
|
Cedric Nugteren
|
ed980a1df1
|
Updated database override function to work with the new database storage format
|
2017-09-24 15:44:14 +02:00 |
|
Cedric Nugteren
|
890281f3e8
|
Made database-caching no longer dependent on device name but on device/platform IDs
|
2017-09-23 17:50:44 +02:00 |
|
Cedric Nugteren
|
132e62892d
|
Implemented proper im2col reference function and completed tests
|
2017-08-19 16:55:09 +02:00 |
|
Cedric Nugteren
|
777681dcbd
|
Merge branch 'master' into im_to_col
|
2017-08-12 20:50:00 +02:00 |
|
Cedric Nugteren
|
844e68853e
|
Moved some utility functions to a test-specific utility compilation-unit
|
2017-08-12 15:38:17 +02:00 |
|
Cedric Nugteren
|
97bcf77d4b
|
First step towards supporting im2col in the test infrastructure
|
2017-07-16 22:33:49 +02:00 |
|
Cedric Nugteren
|
de9ed9d4ea
|
Fixed batched tests when testing for invalid sizes against clBLAS
|
2017-07-12 21:54:16 +02:00 |
|
Cedric Nugteren
|
d4c8a7c8b0
|
Changed printf-statements with %zu into std::cout to fix MSVC 2013 compatibility
|
2017-07-09 20:19:08 +02:00 |
|
Cedric Nugteren
|
4b415bdf3c
|
Disabled UNIX-style terminal color printing under Windows
|
2017-07-09 20:04:13 +02:00 |
|
Cedric Nugteren
|
4e51b1e1f8
|
Moved and inlined some static member variables and disabled spurious clang warnings
|
2017-06-27 21:05:16 +02:00 |
|
Cedric Nugteren
|
e60b10529a
|
Undo of earlier move of TestBlas::kTransposes constant to fix MSVC 2013 compilation
|
2017-06-27 20:59:28 +02:00 |
|
Cedric Nugteren
|
19504ed609
|
Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings
|
2017-06-25 20:59:22 +02:00 |
|
Cedric Nugteren
|
409a5a2ad0
|
Fixed a namespace clash with CUDA FP16 for the half-datatype
|
2017-04-17 16:47:15 +02:00 |
|
Cedric Nugteren
|
2673f50518
|
Merge branch 'development' into benchmarking
|
2017-04-16 19:41:14 +02:00 |
|
Cedric Nugteren
|
10205d773e
|
Added a new Xaxpy kernel in between the regular and fast version in
|
2017-04-14 20:16:10 +02:00 |
|
Cedric Nugteren
|
f24c142948
|
Made compilation of the cuBLAS wrapper work properly
|
2017-04-11 21:50:18 +02:00 |
|
Cedric Nugteren
|
6b625f8915
|
Added reference implementations for performance-testing against cuBLAS
|
2017-04-10 22:54:14 +02:00 |
|
Cedric Nugteren
|
eb1fda2729
|
In-lined the float2 and double2 types to avoid collision with CUDA's definitions
|
2017-04-03 21:44:35 +02:00 |
|
Cedric Nugteren
|
b24d364743
|
Laid the groundwork for cuBLAS comparisons in the clients
|
2017-04-02 18:06:15 +02:00 |
|
Cedric Nugteren
|
b84d2296b8
|
Separated host-device and device-host memory copies from execution of the CBLAS reference code; for fair timing and code de-duplication
|
2017-04-01 13:36:24 +02:00 |
|
Cedric Nugteren
|
a98c00a267
|
Fixed a GCC/MSVC compilation issue
|
2017-03-20 19:53:55 +01:00 |
|
Cedric Nugteren
|
068ff32e9f
|
Fixed a linker issue for Clang
|
2017-03-12 10:41:18 +01:00 |
|
Cedric Nugteren
|
49e04c7fce
|
Added API and test infrastructure for the batched GEMM routine
|
2017-03-10 21:24:35 +01:00 |
|
Cedric Nugteren
|
fa0a9c689f
|
Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes
|
2017-03-08 20:10:20 +01:00 |
|
Cedric Nugteren
|
cdf354f895
|
Adjusted the test-infrastructure to support testing of batched-versions of routines
|
2017-03-05 15:04:16 +01:00 |
|
Cedric Nugteren
|
7f14b11f1e
|
Changed the way the test-data is generated: now using a single MT generator and distribution for all data
|
2017-03-05 11:13:47 +01:00 |
|
Cedric Nugteren
|
f9a520b3af
|
Prepared generator for batched routines; added batched AXPY routine interface
|
2017-03-05 10:38:38 +01:00 |
|
Cedric Nugteren
|
e993ee077b
|
Added a proper data-preparation function for the TRSM tests
|
2017-03-04 15:21:33 +01:00 |
|
Cedric Nugteren
|
e8d5923d27
|
Made a double to float cast explicit for MSVC compatibility (C2397)
|
2017-03-01 20:42:06 +01:00 |
|
Cedric Nugteren
|
d6f1b5fca3
|
Added L2 error computation and checking for half-precision tests
|
2017-02-27 21:49:20 +01:00 |
|
Cedric Nugteren
|
ea6790665d
|
Merge branch 'development' into triangular_solvers
|
2017-02-26 14:51:45 +01:00 |
|
Cedric Nugteren
|
b7310036ed
|
Removed half-precision support from the TRSM routine; too unstable
|
2017-02-26 12:56:21 +01:00 |
|
Cedric Nugteren
|
70d8c4bad7
|
Improved the correctness tests for complex numbers in case either real or imag is much larger than the other
|
2017-02-26 10:19:53 +01:00 |
|
Cedric Nugteren
|
133ebfc834
|
Added data-preparation function for the TRSV tests and special nan/inf checks in the error checking to make the tests pass
|
2017-02-19 17:43:26 +01:00 |
|
Cedric Nugteren
|
7b2170818f
|
Changed the override-parameters test such that it is compatible with more devices
|
2017-02-18 11:22:07 +01:00 |
|
Cedric Nugteren
|
bdc57221bd
|
Added simple tests for the OverrideParameters function
|
2017-02-14 21:09:00 +01:00 |
|
Cedric Nugteren
|
c248f900c0
|
Merge branch 'development' into triangular_solvers
|
2017-02-05 22:18:59 +01:00 |
|
Cedric Nugteren
|
e7cbb5915a
|
Fixed complex version of the TRSV kernel
|
2017-02-05 14:36:31 +01:00 |
|
Ivan Shapovalov
|
064ba4abd4
|
treewide: silence type mismatch warnings in *printf()
|
2017-01-24 02:55:09 +03:00 |
|
Ivan Shapovalov
|
519ccbd273
|
Tester: always fail on OpenCL and CLBlast internal errors
These errors are self-evident and enough to fail the test even if there is
no clBLAS reference to compare error codes with.
|
2017-01-24 02:55:09 +03:00 |
|
Ivan Shapovalov
|
1a1e863ab3
|
treewide: include clpp11.hpp first to silence deprecation warnings
Otherwise, cl.h gets included through clblast.h before clpp11.hpp.
|
2017-01-20 17:32:42 +03:00 |
|
Cedric Nugteren
|
4b3ffd9989
|
Added a first version of the diagonal block invert routine in preparation of TRSM
|
2017-01-15 17:30:00 +01:00 |
|
Cedric Nugteren
|
4a4be0c3a5
|
Prints additional information in verbose/debug mode
|
2017-01-15 17:17:40 +01:00 |
|
Cedric Nugteren
|
39c49bf4f9
|
Made it possible to use the command-line environmental variables for each executable and without re-running CMake
|
2016-11-27 11:00:29 +01:00 |
|
Cedric Nugteren
|
29aab3019e
|
Fixed a bug in the error margins; relaxed the error margins for half-precision
|
2016-11-17 22:19:36 +01:00 |
|
Cedric Nugteren
|
a670c4c4bf
|
All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects
|
2016-10-22 16:14:56 +02:00 |
|
Cedric Nugteren
|
b0ff11acf0
|
Moved files around a bit; created a utilities subfolder
|
2016-10-22 15:36:48 +02:00 |
|