Commit graph

103 commits

Author SHA1 Message Date
Cedric Nugteren b84d2296b8 Separated host-device and device-host memory copies from execution of the CBLAS reference code; for fair timing and code de-duplication 2017-04-01 13:36:24 +02:00
Cedric Nugteren 49e04c7fce Added API and test infrastructure for the batched GEMM routine 2017-03-10 21:24:35 +01:00
Cedric Nugteren 3846f44eaf Small fix for a file that isn't currently compiled anymore 2017-03-10 20:53:20 +01:00
Cedric Nugteren d754586b49 Added proper testing of the alpha parameter; finalized the batched AXPY implementation 2017-03-10 20:49:59 +01:00
Cedric Nugteren fa0a9c689f Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes 2017-03-08 20:10:20 +01:00
Cedric Nugteren 6aba0bbae7 Minor fixes to the client w.r.t. the addition of the batch count 2017-03-05 16:44:16 +01:00
Cedric Nugteren b114ea49a9 Added first naive version of the batched AXPY routine 2017-03-05 15:06:14 +01:00
Cedric Nugteren cdf354f895 Adjusted the test-infrastructure to support testing of batched-versions of routines 2017-03-05 15:04:16 +01:00
Cedric Nugteren 7f14b11f1e Changed the way the test-data is generated: now using a single MT generator and distribution for all data 2017-03-05 11:13:47 +01:00
Cedric Nugteren 37228c9098 Fixed a missing include for the tests 2017-03-04 20:45:39 +01:00
Cedric Nugteren e993ee077b Added a proper data-preparation function for the TRSM tests 2017-03-04 15:21:33 +01:00
Cedric Nugteren a145890aaa Added a guard against invalid buffer sizes in the prepare-data functions for tests 2017-02-26 14:37:29 +01:00
Cedric Nugteren e47d95887c Added PrepareData function for TRSM to create proper test input 2017-02-25 12:23:04 +01:00
Cedric Nugteren 133ebfc834 Added data-preparation function for the TRSV tests and special nan/inf checks in the error checking to make the tests pass 2017-02-19 17:43:26 +01:00
Cedric Nugteren a5fd2323b6 Added prototype for the TRSV routine 2017-01-20 11:30:32 +01:00
Cedric Nugteren df9a77d74d Added first version of the TRSM routine based on the diagonal invert kernel 2017-01-18 21:29:59 +01:00
Cedric Nugteren 4b3ffd9989 Added a first version of the diagonal block invert routine in preparation of TRSM 2017-01-15 17:30:00 +01:00
Cedric Nugteren 681a465b35 Prepared for the addition of the TRSM triangular solver kernel 2016-12-18 12:30:16 +01:00
Cedric Nugteren 60fa2322ca Added a proper half-precision reference for testing of xomatcopy 2016-11-17 22:20:16 +01:00
Cedric Nugteren d595a8ed7e Fixed a bug waiting for an invalid event in case of a non-succesfull CLBlast call in the tests and samples 2016-09-22 20:47:22 +02:00
CNugteren 2c031f3e1d Made it possible to build the OMATCOPY test and client in case only clBLAS is present 2016-06-28 16:36:01 +02:00
Cedric Nugteren f726fbdc9f Moved all headers into the source tree, changed headers to .hpp extension 2016-06-18 20:20:13 +02:00
Cedric Nugteren 52ccaf5b25 Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing 2016-06-16 18:07:46 +02:00
Cedric Nugteren 03182f9d07 Added half-precision tests for the clBLAS reference through conversion to single-precision 2016-05-26 23:36:19 +02:00
cnugteren 1acb31896c Fixed an issue with computing the GFLOPS numbers for the xGEMM performance tests for non-square matrices 2016-05-08 10:06:06 +02:00
Cedric Nugteren 3555cd0436 All CLBlast enum constants now have the same raw values as in the cblas standard 2016-04-27 11:37:55 +02:00
cnugteren 16a048f1ac Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines 2016-04-20 22:12:51 -06:00
cnugteren 8be99de82d Added support for the SASUM/DASUM/ScASUM/DzASUM routines 2016-04-14 19:58:26 -06:00
cnugteren 1a82861a90 Added support for testing (performance and correctness) against a CPU BLAS library 2016-04-02 11:58:00 -07:00
Cedric Nugteren aaa687ca98 Added preliminary support for the xNRM2 routines 2016-03-28 23:00:44 +02:00
Cedric Nugteren 306bf67660 Added preliminary support for xHPR2 and xSPR2 routines 2016-03-06 15:48:11 +01:00
Cedric Nugteren 60da54da5d Added preliminary support for xHER2 and xSYR2 routines 2016-03-02 21:18:01 +01:00
Cedric Nugteren e3545215a5 Added support for xHER, xHPR, xSYR, and xSPR routines 2016-02-28 14:16:48 +01:00
Cedric Nugteren 6dc44da07b Added support for xGERU and xGERC routines 2016-02-20 14:15:41 +01:00
Cedric Nugteren 8854a73127 Added XGER routine, kernel, and tuner 2016-02-20 12:40:01 +01:00
CNugteren 2b56c2c603 Added TRMV/TBMV/TPMV routines 2015-09-26 16:58:03 +02:00
CNugteren de6547a92b Added SBMV and SPMV routines 2015-09-19 18:01:19 +02:00
CNugteren 80da67d28b Added the HPMV routine 2015-09-19 17:40:38 +02:00
CNugteren aebd156869 Added the HBMV routine 2015-09-19 11:11:34 +02:00
CNugteren 4507ba4997 Added first version of banded matrix-vector multiplication 2015-09-18 15:25:20 +02:00
CNugteren a2e726d3bd Added xDOT/xDOTU/xDOTC dot-product routines 2015-09-14 16:57:00 +02:00
CNugteren ff0c54c386 Added the XSWAP, XSCAL and XCOPY level-1 routines 2015-08-22 17:11:20 +02:00
CNugteren 938ca2707f Added HEMV routine 2015-07-31 17:35:42 +02:00
CNugteren b89517a2e7 Added SYMV routine 2015-07-31 17:13:41 +02:00
CNugteren c5d5adbddd Refactored the correctness tests 2015-07-31 15:52:13 +02:00
CNugteren f7199b831f Now using the new Claduc C++11 OpenCL header 2015-07-27 07:18:06 +02:00
CNugteren aa852bbe67 Added subfolders for the level1/2/3 routines 2015-07-12 16:57:09 +02:00
CNugteren b5d39d9d0c Added the HEMM routine, tester, and client 2015-07-12 15:11:50 +02:00
CNugteren b02876d6e9 Added the HER2K routine, tester, and client 2015-07-10 20:59:20 +02:00
CNugteren 919bba3eaf Added the HERK routine, tester, and client 2015-07-10 07:19:59 +02:00
CNugteren d9ea0c47c6 Added the TRMM routine, tester, and client 2015-07-02 07:16:04 +02:00
CNugteren 2914a285d4 Re-organized the performance-client infrastructure to avoid code duplication 2015-06-29 20:38:34 +02:00
CNugteren e5c0edbfd7 Re-organized the test infrastructure to avoid code duplication 2015-06-28 15:52:57 +02:00