Cedric Nugteren | 97bcf77d4b | First step towards supporting im2col in the test infrastructure | 2017-07-16 22:33:49 +02:00
Cedric Nugteren | 409a5a2ad0 | Fixed a namespace clash with CUDA FP16 for the half-datatype | 2017-04-17 16:47:15 +02:00
Cedric Nugteren | eb1fda2729 | In-lined the float2 and double2 types to avoid collision with CUDA's definitions | 2017-04-03 21:44:35 +02:00
Cedric Nugteren | 49e04c7fce | Added API and test infrastructure for the batched GEMM routine | 2017-03-10 21:24:35 +01:00
Cedric Nugteren | f9a520b3af | Prepared generator for batched routines; added batched AXPY routine interface | 2017-03-05 10:38:38 +01:00
Cedric Nugteren | b7310036ed | Removed half-precision support from the TRSM routine; too unstable | 2017-02-26 12:56:21 +01:00
Cedric Nugteren | 4b3ffd9989 | Added a first version of the diagonal block invert routine in preparation of TRSM | 2017-01-15 17:30:00 +01:00
Cedric Nugteren | 39c49bf4f9 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | 2016-11-27 11:00:29 +01:00
Cedric Nugteren | 61203453aa | Renamed all C++ source files to .cpp to match the .hpp extension better | 2016-06-19 13:55:49 +02:00
Cedric Nugteren | f726fbdc9f | Moved all headers into the source tree, changed headers to .hpp extension | 2016-06-18 20:20:13 +02:00
Cedric Nugteren | 52ccaf5b25 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing | 2016-06-16 18:07:46 +02:00
Cedric Nugteren | 4612ff3552 | Added possibility to run the performance client with half-precision | 2016-05-25 14:37:26 +02:00
cnugteren | 894983fc3c | Added prototype for ixAMAX routines | 2016-04-20 21:11:33 -06:00
cnugteren | e0497807e2 | Added prototype for xASUM routines | 2016-04-13 21:44:49 -06:00
cnugteren | 8c3c6db7d0 | Merge branch 'level1_routines' into development | 2016-03-30 21:37:56 -07:00
Cedric Nugteren | c1df786764 | Added prototypes for the xROTM and xROTMG routines | 2016-03-30 16:13:37 -07:00
Cedric Nugteren | 6ecc0d089c | Added prototypes for the xROT and xROTG functions | 2016-03-30 16:13:32 -07:00
Cedric Nugteren | 1d5a702d9d | Added prototypes for ScNRM2/DzNRM2 routines | 2016-03-25 10:30:38 +01:00
Cedric Nugteren | 3876096c30 | Added prototypes for SNRM2/DNRM2 routines | 2016-03-25 10:00:40 +01:00
Cedric Nugteren | 9f682aa66b | Set a proper default precision for the CLBlast clients | 2016-02-20 14:41:53 +01:00
CNugteren | 4796c9bcbd | Added generated main functions for correctness/performance tests for level 2 routines | 2015-09-18 10:19:03 +02:00
CNugteren | a2e726d3bd | Added xDOT/xDOTU/xDOTC dot-product routines | 2015-09-14 16:57:00 +02:00
CNugteren | ff0c54c386 | Added the XSWAP, XSCAL and XCOPY level-1 routines | 2015-08-22 17:11:20 +02:00
CNugteren | 938ca2707f | Added HEMV routine | 2015-07-31 17:35:42 +02:00
CNugteren | b89517a2e7 | Added SYMV routine | 2015-07-31 17:13:41 +02:00
CNugteren | aa852bbe67 | Added subfolders for the level1/2/3 routines | 2015-07-12 16:57:09 +02:00
CNugteren | b5d39d9d0c | Added the HEMM routine, tester, and client | 2015-07-12 15:11:50 +02:00
CNugteren | b02876d6e9 | Added the HER2K routine, tester, and client | 2015-07-10 20:59:20 +02:00
CNugteren | 919bba3eaf | Added the HERK routine, tester, and client | 2015-07-10 07:19:59 +02:00
CNugteren | 2fe3fe1580 | The clients now distinguish between the memory and alpha/beta data-type | 2015-07-10 07:18:12 +02:00
CNugteren | d9ea0c47c6 | Added the TRMM routine, tester, and client | 2015-07-02 07:16:04 +02:00
CNugteren | 2914a285d4 | Re-organized the performance-client infrastructure to avoid code duplication | 2015-06-29 20:38:34 +02:00
CNugteren | 7c8d16147a | Added the SYR2K routine, tester, and client | 2015-06-26 08:12:56 +02:00
CNugteren | 75f263ce3a | Added symmetric matrix support for the ABC performance tester | 2015-06-26 08:10:23 +02:00
CNugteren | 60a88aac86 | Added the SYRK routine, tester, and client | 2015-06-24 07:50:18 +02:00
CNugteren | 0a3831e6d1 | Updated bandwidth computation for GEMM and SYMM | 2015-06-23 08:09:46 +02:00
CNugteren | ea7da6a497 | Fixed support for complex data-types for GEMM and SYMM clients | 2015-06-21 11:21:03 +02:00
CNugteren | e522d1a74e | Added initial version of GEMV including tester and performance client | 2015-06-13 11:01:20 +02:00
CNugteren | bc5a341dfe | Initial commit of preview version | 2015-05-30 12:30:43 +02:00