Cedric Nugteren | 3bb1b5fa6e | Merge pull request #13 from CNugteren/bypass_pre_post_processing: Bypass pre/post-processing | 2015-07-15 22:27:56 +02:00
CNugteren | 6908c4ebd2 | Updated changelog with pre/post-processing bypass | 2015-07-15 22:24:15 +02:00
CNugteren | ba0026d2b9 | Changed performance graphs to default to column-major | 2015-07-15 22:21:24 +02:00
CNugteren | b526623fc7 | Skips pre/post processing kernels if not needed | 2015-07-15 22:12:38 +02:00
CNugteren | 0dc85845f7 | Updated interface of the PadCopyTransposeMatrix method | 2015-07-13 08:41:26 +02:00
Cedric Nugteren | 530418f06f | Merge pull request #12 from CNugteren/level_subfolders: Added subfolders for the level1/2/3 routines | 2015-07-12 16:59:17 +02:00
CNugteren | aa852bbe67 | Added subfolders for the level1/2/3 routines | 2015-07-12 16:57:09 +02:00
Cedric Nugteren | 721546e64a | Merge pull request #11 from CNugteren/level3_routines_2: Added level-3 routines | 2015-07-12 15:22:11 +02:00
CNugteren | c920400261 | Added HEMM, HERK, HER2K, and TRMM | 2015-07-12 15:14:35 +02:00
CNugteren | b5d39d9d0c | Added the HEMM routine, tester, and client | 2015-07-12 15:11:50 +02:00
CNugteren | 9a929f3fb2 | Disabled prototype of TRSM | 2015-07-10 21:08:18 +02:00
CNugteren | b02876d6e9 | Added the HER2K routine, tester, and client | 2015-07-10 20:59:20 +02:00
CNugteren | 919bba3eaf | Added the HERK routine, tester, and client | 2015-07-10 07:19:59 +02:00
CNugteren | 2fe3fe1580 | The clients now distinguish between the memory and alpha/beta data-type | 2015-07-10 07:18:12 +02:00
CNugteren | 5578d5ab28 | Added option to set the imaginary part of the diagonal to zero | 2015-07-08 07:25:18 +02:00
CNugteren | 82469fc764 | The testers now distinguish between the memory and alpha/beta data-type | 2015-07-08 07:21:44 +02:00
CNugteren | 599f9a70a6 | Added option to set the imaginary part of the diagonal to zero | 2015-07-07 07:34:36 +02:00
CNugteren | d9ea0c47c6 | Added the TRMM routine, tester, and client | 2015-07-02 07:16:04 +02:00
CNugteren | 500416aa38 | Fixed the order of arguments | 2015-07-02 07:12:49 +02:00
CNugteren | d879eb3abf | Added a set-to-one function for kernels | 2015-07-02 07:11:27 +02:00
CNugteren | e3dd35f91b | Added the unit/non-unit diagonal enum | 2015-07-01 09:39:41 +02:00
CNugteren | b8d81a60d6 | Fixed typos in SYMM | 2015-07-01 09:38:04 +02:00
CNugteren | 8574f72d46 | Added the TRMM and TRSM interface | 2015-06-30 07:36:11 +02:00
CNugteren | a591d5607d | Added constness to all cl_mem objects | 2015-06-30 07:35:54 +02:00
CNugteren | 14186af590 | Added TRMM and TRSM clBLAS wrappers | 2015-06-30 07:19:46 +02:00
Cedric Nugteren | cbf2eef179 | Merge pull request #10 from CNugteren/test_infrastructure: Re-organized test infrastructure | 2015-06-29 20:45:10 +02:00
CNugteren | 3726f6a618 | Re-organized test and client infrastructure | 2015-06-29 20:42:34 +02:00
CNugteren | ede78fe499 | Fixed the license for the correctness testers | 2015-06-29 20:39:51 +02:00
CNugteren | 2914a285d4 | Re-organized the performance-client infrastructure to avoid code duplication | 2015-06-29 20:38:34 +02:00
CNugteren | e5c0edbfd7 | Re-organized the test infrastructure to avoid code duplication | 2015-06-28 15:52:57 +02:00
CNugteren | cf1892d22c | Added buffer structure and sizes to arguments | 2015-06-28 15:37:38 +02:00
Cedric Nugteren | 77e2157485 | Merge pull request #9 from CNugteren/level3_routines: Added SYRK and SYR2K level-3 routines | 2015-06-26 20:56:21 +02:00
CNugteren | e27e339ebf | Replaced crosses with tickmarks | 2015-06-26 17:43:17 +02:00
CNugteren | 7c8d16147a | Added the SYR2K routine, tester, and client | 2015-06-26 08:12:56 +02:00
CNugteren | 75f263ce3a | Added symmetric matrix support for the ABC performance tester | 2015-06-26 08:10:23 +02:00
CNugteren | ff9f9fac57 | Added option to test only symmetric matrices (m=n) | 2015-06-25 20:39:34 +02:00
CNugteren | 57c705dbf2 | Clarified comment | 2015-06-25 20:38:34 +02:00
CNugteren | 96e4012349 | Added SSYRK performance graphs | 2015-06-25 19:19:31 +02:00
CNugteren | 3de4471afe | Added the SYRK routine | 2015-06-24 07:52:19 +02:00
CNugteren | 60a88aac86 | Added the SYRK routine, tester, and client | 2015-06-24 07:50:18 +02:00
CNugteren | a17297937d | Added performance-client for AC routines | 2015-06-23 22:31:27 +02:00
CNugteren | 9fc38cdf5e | Added a lower/upper triangular version of the GEMM kernel | 2015-06-23 17:58:51 +02:00
CNugteren | 0a3831e6d1 | Updated bandwidth computation for GEMM and SYMM | 2015-06-23 08:09:46 +02:00
CNugteren | 20eb3506d6 | Added a condition to update only lower/upper triangular parts in the un-pad kernels | 2015-06-23 08:09:07 +02:00
CNugteren | 4c2a166bc5 | Added test infrastructure for AB and AC routines | 2015-06-21 12:57:38 +02:00
CNugteren | e3829c1067 | Added prototypes of SYRK and SYR2K | 2015-06-21 12:44:03 +02:00
CNugteren | ea7da6a497 | Fixed support for complex data-types for GEMM and SYMM clients | 2015-06-21 11:21:03 +02:00
Cedric Nugteren | 18251df848 | Merge pull request #7 from CNugteren/development: Update to version 0.2.0 | 2015-06-21 09:15:41 +02:00
CNugteren | 985eeac503 | Updated to version 0.2.0 | 2015-06-21 09:13:08 +02:00
CNugteren | 6aac23be86 | Updated performance graphs for Intel Iris GPUs | 2015-06-21 09:12:42 +02:00