CNugteren | ff0c54c386 | Added the XSWAP, XSCAL and XCOPY level-1 routines | 2015-08-22 17:11:20 +02:00
CNugteren | 75517353d5 | Re-organized level1 xaxpy kernel | 2015-08-22 14:33:48 +02:00
Cedric Nugteren | cf168fca70 | Merge pull request #23 from CNugteren/tuner_database: Added initial version of a tuner-database | 2015-08-20 08:38:18 +02:00
CNugteren | 15db2bcc20 | Added initial version of tuner-database Python script | 2015-08-20 08:30:51 +02:00
CNugteren | b46de22433 | Moved precision tester to utilities | 2015-08-19 19:34:29 +02:00
CNugteren | cbd25bffea | Added hotfix 8eeb7f721f | 2015-08-19 11:12:16 +02:00
Cedric Nugteren | 4f6e42d052 | Merge pull request #21 from CNugteren/c_api: Added a plain C API | 2015-08-13 18:02:03 +02:00
CNugteren | 603e389545 | Added all supported routines to the C API | 2015-08-13 17:58:46 +02:00
CNugteren | 8eeb7f721f | Fixed a complex data-type bug in the transpose kernel | 2015-08-13 14:33:42 +02:00
CNugteren | 8617195ac5 | Added initial version of C API with just one routine | 2015-08-13 13:46:13 +02:00
CNugteren | dbdb58c600 | Refactored the tuners, added JSON output | 2015-08-09 15:50:41 +02:00
CNugteren | 75b4d92ac3 | Added distinguished names for GEMV inherited HEMV/SYMV | 2015-08-04 08:15:39 +02:00
CNugteren | d1a7cf18ec | Abstracted loading of matrix A for GEMV kernel | 2015-08-03 07:37:14 +02:00
CNugteren | 938ca2707f | Added HEMV routine | 2015-07-31 17:35:42 +02:00
CNugteren | b89517a2e7 | Added SYMV routine | 2015-07-31 17:13:41 +02:00
CNugteren | f7199b831f | Now using the new Claduc C++11 OpenCL header | 2015-07-27 07:18:06 +02:00
CNugteren | 4dcecfe934 | Added workgroup shuffle option to transpose kernel for AMD GPUs | 2015-07-22 07:31:16 +02:00
CNugteren | d93efa3169 | Transpose kernel now uses vectorized local memory loads and stores | 2015-07-21 08:22:18 +02:00
CNugteren | a0f0f6c8ce | Triangular GEMM kernels are only compiled when needed | 2015-07-19 16:36:12 +02:00
CNugteren | 48e2e96f1b | Kernel caching is now based on a routine's name | 2015-07-19 16:24:14 +02:00
CNugteren | 4e499a67c1 | The kernel source string is now a routine's member variable | 2015-07-19 13:44:37 +02:00
CNugteren | 9300261bd4 | Fixed a bug when using the Xgemm kernel without local memory | 2015-07-16 22:49:55 +02:00
CNugteren | 0157d6d4ea | Using mad() instruction for AMD devices like clBLAS does | 2015-07-16 22:42:02 +02:00
CNugteren | b526623fc7 | Skips pre/post processing kernels if not needed | 2015-07-15 22:12:38 +02:00
CNugteren | 0dc85845f7 | Updated interface of the PadCopyTransposeMatrix method | 2015-07-13 08:41:26 +02:00
CNugteren | aa852bbe67 | Added subfolders for the level1/2/3 routines | 2015-07-12 16:57:09 +02:00
CNugteren | b5d39d9d0c | Added the HEMM routine, tester, and client | 2015-07-12 15:11:50 +02:00
CNugteren | 9a929f3fb2 | Disabled prototype of TRSM | 2015-07-10 21:08:18 +02:00
CNugteren | b02876d6e9 | Added the HER2K routine, tester, and client | 2015-07-10 20:59:20 +02:00
CNugteren | 919bba3eaf | Added the HERK routine, tester, and client | 2015-07-10 07:19:59 +02:00
CNugteren | 5578d5ab28 | Added option to set the imaginary part of the diagonal to zero | 2015-07-08 07:25:18 +02:00
CNugteren | 599f9a70a6 | Added option to set the imaginary part of the diagonal to zero | 2015-07-07 07:34:36 +02:00
CNugteren | d9ea0c47c6 | Added the TRMM routine, tester, and client | 2015-07-02 07:16:04 +02:00
CNugteren | d879eb3abf | Added a set-to-one function for kernels | 2015-07-02 07:11:27 +02:00
CNugteren | e3dd35f91b | Added the unit/non-unit diagonal enum | 2015-07-01 09:39:41 +02:00
CNugteren | b8d81a60d6 | Fixed typos in SYMM | 2015-07-01 09:38:04 +02:00
CNugteren | 8574f72d46 | Added the TRMM and TRSM interface | 2015-06-30 07:36:11 +02:00
CNugteren | 7c8d16147a | Added the SYR2K routine, tester, and client | 2015-06-26 08:12:56 +02:00
CNugteren | 57c705dbf2 | Clarified comment | 2015-06-25 20:38:34 +02:00
CNugteren | 60a88aac86 | Added the SYRK routine, tester, and client | 2015-06-24 07:50:18 +02:00
CNugteren | 9fc38cdf5e | Added a lower/upper triangular version of the GEMM kernel | 2015-06-23 17:58:51 +02:00
CNugteren | 20eb3506d6 | Added a condition to update only lower/upper triangular parts in the un-pad kernels | 2015-06-23 08:09:07 +02:00
CNugteren | e3829c1067 | Added prototypes of SYRK and SYR2K | 2015-06-21 12:44:03 +02:00
CNugteren | 3ea3ba2bee | Distinguish between a short smoke test and a full test | 2015-06-20 13:33:50 +02:00
CNugteren | e26742c629 | Added additional absolute error checking when testing | 2015-06-20 10:58:21 +02:00
CNugteren | 682c01a80c | Now returns program from database by reference | 2015-06-18 18:44:14 +02:00
CNugteren | 7e176ccac9 | Added support for conjugate transpose in GEMV | 2015-06-16 08:42:52 +02:00
CNugteren | af78a04eca | Updated the tuners to set the conjugate argument | 2015-06-16 07:50:45 +02:00
CNugteren | e03582a112 | Added support for CGEMM/ZGEMM and CSYMM/ZSYMM | 2015-06-16 07:45:09 +02:00
CNugteren | 8f01c644b5 | Added support for complex conjugate transpose | 2015-06-16 07:43:19 +02:00