Cedric Nugteren
844e68853e
Moved some utility functions to a test-specific utility compilation-unit
2017-08-12 15:38:17 +02:00
Cedric Nugteren
97bcf77d4b
First step towards supporting im2col in the test infrastructure
2017-07-16 22:33:49 +02:00
Cedric Nugteren
de9ed9d4ea
Fixed batched tests when testing for invalid sizes against clBLAS
2017-07-12 21:54:16 +02:00
Cedric Nugteren
f77b48692b
Relaxed requirement on a_ld and b_ld for batched GEMM
2017-07-12 21:53:39 +02:00
Cedric Nugteren
d4c8a7c8b0
Changed printf-statements with %zu into std::cout to fix MSVC 2013 compatibility
2017-07-09 20:19:08 +02:00
Cedric Nugteren
4b415bdf3c
Disabled UNIX-style terminal color printing under Windows
2017-07-09 20:04:13 +02:00
Cedric Nugteren
4e51b1e1f8
Moved and inlined some static member variables and disabled spurious clang warnings
2017-06-27 21:05:16 +02:00
Cedric Nugteren
e60b10529a
Undo of earlier move of TestBlas::kTransposes constant to fix MSVC 2013 compilation
2017-06-27 20:59:28 +02:00
Cedric Nugteren
ce528a9d39
Fixed and suppressed several warnings for MSVC
2017-06-26 21:38:04 +02:00
Cedric Nugteren
19504ed609
Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings
2017-06-25 20:59:22 +02:00
Cedric Nugteren
1a8ed48a35
Fixed some Clang and MSVC warnings
2017-06-25 11:50:36 +02:00
Cedric Nugteren
93c8db7fe7
Bug-fix in the half-precision test of the amax routine
2017-05-11 22:19:15 -07:00
Cedric Nugteren
049d0fc95a
Fixed a compiler warning message
2017-04-23 10:45:08 +02:00
Cedric Nugteren
409a5a2ad0
Fixed a namespace clash with CUDA FP16 for the half-datatype
2017-04-17 16:47:15 +02:00
Cedric Nugteren
2673f50518
Merge branch 'development' into benchmarking
2017-04-16 19:41:14 +02:00
Cedric Nugteren
e3bb58f602
Finalized support for performance testing against cuBLAS
2017-04-16 17:53:51 +02:00
Cedric Nugteren
10205d773e
Added a new Xaxpy kernel in between the regular and fast version in
2017-04-14 20:16:10 +02:00
Cedric Nugteren
f7f8ec644f
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works
2017-04-13 21:31:27 +02:00
Cedric Nugteren
f24c142948
Made compilation of the cuBLAS wrapper work properly
2017-04-11 21:50:18 +02:00
Cedric Nugteren
6b625f8915
Added reference implementations for performance-testing against cuBLAS
2017-04-10 22:54:14 +02:00
Cedric Nugteren
52dd7433ca
Completed the cuBLAS wrapper
2017-04-06 20:56:28 +02:00
Cedric Nugteren
dbe22b5bf3
Fixed some size_t to int conversion warnings for the CBLAS interface
2017-04-06 19:40:51 +02:00
Cedric Nugteren
674ff96fdf
Added a first version of a cuBLAS wrapper (WIP)
2017-04-05 21:27:25 +02:00
Cedric Nugteren
af9a521042
Fixes the CUDA wrapper (now actually tested on a system with CUDA)
2017-04-03 21:46:07 +02:00
Cedric Nugteren
eb1fda2729
In-lined the float2 and double2 types to avoid collision with CUDA's definitions
2017-04-03 21:44:35 +02:00
Cedric Nugteren
b24d364743
Laid the groundwork for cuBLAS comparisons in the clients
2017-04-02 18:06:15 +02:00
Cedric Nugteren
c5461d77e5
Factored out inclusion of clBLAS and CBLAS from the test-routine files
2017-04-02 15:24:21 +02:00
Cedric Nugteren
a9c25e9fd2
Factored out inclusion of clBLAS and CBLAS from the test-routine files
2017-04-02 15:21:19 +02:00
Cedric Nugteren
b84d2296b8
Separated host-device and device-host memory copies from execution of the CBLAS reference code; for fair timing and code de-duplication
2017-04-01 13:36:24 +02:00
Cedric Nugteren
a98c00a267
Fixed a GCC/MSVC compilation issue
2017-03-20 19:53:55 +01:00
Cedric Nugteren
0610447a7a
Fixed a compilation issue for GCC/MSVC
2017-03-19 17:37:52 +01:00
Cedric Nugteren
068ff32e9f
Fixed a linker issue for Clang
2017-03-12 10:41:18 +01:00
Cedric Nugteren
49e04c7fce
Added API and test infrastructure for the batched GEMM routine
2017-03-10 21:24:35 +01:00
Cedric Nugteren
3846f44eaf
Small fix for a file that isn't currently compiled anymore
2017-03-10 20:53:20 +01:00
Cedric Nugteren
d754586b49
Added proper testing of the alpha parameter; finalized the batched AXPY implementation
2017-03-10 20:49:59 +01:00
Cedric Nugteren
fa0a9c689f
Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes
2017-03-08 20:10:20 +01:00
Cedric Nugteren
6aba0bbae7
Minor fixes to the client w.r.t. the addition of the batch count
2017-03-05 16:44:16 +01:00
Cedric Nugteren
b114ea49a9
Added first naive version of the batched AXPY routine
2017-03-05 15:06:14 +01:00
Cedric Nugteren
cdf354f895
Adjusted the test-infrastructure to support testing of batched-versions of routines
2017-03-05 15:04:16 +01:00
Cedric Nugteren
7f14b11f1e
Changed the way the test-data is generated: now using a single MT generator and distribution for all data
2017-03-05 11:13:47 +01:00
Cedric Nugteren
f9a520b3af
Prepared generator for batched routines; added batched AXPY routine interface
2017-03-05 10:38:38 +01:00
Cedric Nugteren
37228c9098
Fixed a missing include for the tests
2017-03-04 20:45:39 +01:00
Cedric Nugteren
e993ee077b
Added a proper data-preparation function for the TRSM tests
2017-03-04 15:21:33 +01:00
Cedric Nugteren
e8d5923d27
Made a double to float cast explicit for MSVC compatibility (C2397)
2017-03-01 20:42:06 +01:00
Cedric Nugteren
d6f1b5fca3
Added L2 error computation and checking for half-precision tests
2017-02-27 21:49:20 +01:00
Cedric Nugteren
ea6790665d
Merge branch 'development' into triangular_solvers
2017-02-26 14:51:45 +01:00
Cedric Nugteren
a145890aaa
Added a guard against invalid buffer sizes in the prepare-data functions for tests
2017-02-26 14:37:29 +01:00
Cedric Nugteren
b7310036ed
Removed half-precision support from the TRSM routine; too unstable
2017-02-26 12:56:21 +01:00
Cedric Nugteren
70d8c4bad7
Improved the correctness tests for complex numbers in case either real or imag is much larger than the other
2017-02-26 10:19:53 +01:00
Cedric Nugteren
e47d95887c
Added PrepareData function for TRSM to create proper test input
2017-02-25 12:23:04 +01:00
Cedric Nugteren
133ebfc834
Added data-preparation function for the TRSV tests and special nan/inf checks in the error checking to make the tests pass
2017-02-19 17:43:26 +01:00
Cedric Nugteren
7b2170818f
Changed the override-parameters test such that it is compatible with more devices
2017-02-18 11:22:07 +01:00
Cedric Nugteren
bdc57221bd
Added simple tests for the OverrideParameters function
2017-02-14 21:09:00 +01:00
Cedric Nugteren
c248f900c0
Merge branch 'development' into triangular_solvers
2017-02-05 22:18:59 +01:00
Cedric Nugteren
e7cbb5915a
Fixed complex version of the TRSV kernel
2017-02-05 14:36:31 +01:00
Ivan Shapovalov
064ba4abd4
treewide: silence type mismatch warnings in *printf()
2017-01-24 02:55:09 +03:00
Ivan Shapovalov
519ccbd273
Tester: always fail on OpenCL and CLBlast internal errors
These errors are self-evident and enough to fail the test even if there is
no clBLAS reference to compare error codes with.
2017-01-24 02:55:09 +03:00
Ivan Shapovalov
1a1e863ab3
treewide: include clpp11.hpp first to silence deprecation warnings
Otherwise, cl.h gets included through clblast.h before clpp11.hpp.
2017-01-20 17:32:42 +03:00
Cedric Nugteren
a5fd2323b6
Added prototype for the TRSV routine
2017-01-20 11:30:32 +01:00
Cedric Nugteren
df9a77d74d
Added first version of the TRSM routine based on the diagonal invert kernel
2017-01-18 21:29:59 +01:00
Cedric Nugteren
4b3ffd9989
Added a first version of the diagonal block invert routine in preparation of TRSM
2017-01-15 17:30:00 +01:00
Cedric Nugteren
4a4be0c3a5
Prints additional information in verbose/debug mode
2017-01-15 17:17:40 +01:00
Cedric Nugteren
681a465b35
Prepared for the addition of the TRSM triangular solver kernel
2016-12-18 12:30:16 +01:00
Cedric Nugteren
39c49bf4f9
Made it possible to use the command-line environmental variables for each executable and without re-running CMake
2016-11-27 11:00:29 +01:00
Cedric Nugteren
60fa2322ca
Added a proper half-precision reference for testing of xomatcopy
2016-11-17 22:20:16 +01:00
Cedric Nugteren
29aab3019e
Fixed a bug in the error margins; relaxed the error margins for half-precision
2016-11-17 22:19:36 +01:00
Cedric Nugteren
a670c4c4bf
All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects
2016-10-22 16:14:56 +02:00
Cedric Nugteren
b0ff11acf0
Moved files around a bit; created a utilities subfolder
2016-10-22 15:36:48 +02:00
Cedric Nugteren
d0b8ca9fba
Fixed compilation issues of the testers for Visual Studio 2013: mostly conversions of class constants to static
2016-10-18 10:19:03 +02:00
Cedric Nugteren
6178fcd584
Now generates test/client/tuner data using a fixed seed to enable reproducibility of results
2016-09-27 19:55:21 +02:00
Cedric Nugteren
e3076d26cc
Added more relaxed error checking for the half-precision tests
2016-09-27 19:42:58 +02:00
Cedric Nugteren
d595a8ed7e
Fixed a bug waiting for an invalid event in case of an unsuccessful CLBlast call in the tests and samples
2016-09-22 20:47:22 +02:00
Ivan Shapovalov
ea43936e94
test/correctness: read platform and device from environment
Support passing environment variables CLBLAST_PLATFORM and CLBLAST_DEVICE
instead of -platform and -device arguments to test executables.
This is for `ctest`.
2016-08-27 05:37:26 +03:00
Cedric Nugteren
77325b8974
Added an option to the performance clients to do a warm-up run before timing
2016-07-06 21:25:55 +02:00
CNugteren
2c031f3e1d
Made it possible to build the OMATCOPY test and client in case only clBLAS is present
2016-06-28 16:36:01 +02:00
Cedric Nugteren
69beca90f4
Moved the performance graph scripts to the 'scripts' subfolder
2016-06-27 11:51:57 +02:00
Cedric Nugteren
fdfbc9af13
Changed the symbol for error-code skipped tests to distinguish from successful error-code checks in the correctness tests
2016-06-27 11:27:54 +02:00
Cedric Nugteren
8f7131bd90
Increased the verbosity of the '-verbose' option for the correctness tests, now printing when a library is called
2016-06-27 11:16:30 +02:00
Cedric Nugteren
61203453aa
Renamed all C++ source files to .cpp to match the .hpp extension better
2016-06-19 13:55:49 +02:00
Cedric Nugteren
f726fbdc9f
Moved all headers into the source tree, changed headers to .hpp extension
2016-06-18 20:20:13 +02:00
Cedric Nugteren
52ccaf5b25
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing
2016-06-16 18:07:46 +02:00
Cedric Nugteren
6d6b030053
Made the CPU BLAS library the default reference to test against in favor of clBLAS
2016-06-08 09:21:39 +02:00
Cedric Nugteren
c1895ea459
Made the tests for invalid buffer sizes also verbose in verbose mode
2016-06-06 12:20:42 +02:00
Cedric Nugteren
e561e3fbd5
Added return value to the test binaries (0: success, 1: failure), allowing it to work under CTest properly
2016-06-02 16:24:22 +02:00
Cedric Nugteren
f6b2cd9579
Increased the verbosity of the -verbose option in the correctness tests
2016-05-30 20:07:09 +02:00
Cedric Nugteren
03182f9d07
Added half-precision tests for the clBLAS reference through conversion to single-precision
2016-05-26 23:36:19 +02:00
Cedric Nugteren
b487d4dd44
Added half-precision tests for the CBLAS reference through conversion to single-precision
2016-05-26 13:15:27 +02:00
Cedric Nugteren
4612ff3552
Added possibility to run the performance client with half-precision
2016-05-25 14:37:26 +02:00
Cedric Nugteren
803aaf3070
Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASUM/HSUM/iHAMAX/iHMAX/iHMIN
2016-05-22 14:47:14 +02:00
Cedric Nugteren
489c5d76cf
Merged in latest changes from 0.7.1 release
2016-05-18 21:32:56 +02:00
CNugteren
942912daeb
Fixes for compilation of the tests under Visual Studio 2015
2016-05-08 21:11:37 +02:00
cnugteren
3b81ee2c08
Fixed an issue where the xAMAX tester would incorrectly report failures when testing against CBLAS
2016-05-08 18:28:01 +02:00
cnugteren
eaf1de5745
Fixed an issue where the xNRM2 and xASUM testers would incorrectly report failures for complex inputs
2016-05-08 18:07:55 +02:00
cnugteren
1acb31896c
Fixed an issue with computing the GFLOPS numbers for the xGEMM performance tests for non-square matrices
2016-05-08 10:06:06 +02:00
Cedric Nugteren
6c9e08c5e2
Added an option to the tests to control whether to test against clBLAS or a CPU BLAS library
2016-05-07 12:22:06 +02:00
Cedric Nugteren
56aa1701c9
Added printing of indices when testing in verbose mode
2016-05-05 23:09:57 +02:00
Cedric Nugteren
aa97c836b1
Fixed an issue with linking against the ATLAS BLAS library
2016-05-04 19:16:09 +02:00
Cedric Nugteren
44bdb60e83
Relaxed the absolute error margin for floating-point value comparisons to 1e-4
2016-04-27 14:42:30 +02:00
Cedric Nugteren
226e834d0a
Added a '-verbose' option to the test binaries to report errors in more detail if needed
2016-04-27 14:38:30 +02:00
Cedric Nugteren
3555cd0436
All CLBlast enum constants now have the same raw values as in the cblas standard
2016-04-27 11:37:55 +02:00