Cedric Nugteren
|
1155c068e9
|
Updated to version 1.0.0
|
2017-07-30 20:54:21 +02:00 |
|
Cedric Nugteren
|
ae068771da
|
Fixes for Travis automatic deployment
|
2017-07-30 18:53:26 +02:00 |
|
Cedric Nugteren
|
b494df1111
|
Fixes warnings for Clang & AppleClang
|
2017-07-30 18:52:20 +02:00 |
|
Cedric Nugteren
|
6ceb9b7152
|
Fixes to AppVeyor and Travis scripts
|
2017-07-30 18:34:39 +02:00 |
|
Cedric Nugteren
|
e6f938e0e9
|
Improved deployment procedure of automatic builds
|
2017-07-30 18:19:46 +02:00 |
|
Cedric Nugteren
|
18d832e149
|
Added tuning results for the Qualcomm Adreno 330 GPU
|
2017-07-30 18:18:02 +02:00 |
|
Cedric Nugteren
|
0ea16a0e63
|
Minor optimization for the direct GEMM kernel: don't ceil m and n unnecessarily high
|
2017-07-25 20:53:12 +02:00 |
|
Cedric Nugteren
|
b7473f50df
|
Added status badges for correctness tests; updated list of contributors; fixed minor typos
|
2017-07-24 20:14:47 +02:00 |
|
Cedric Nugteren
|
55861c40ff
|
Merge branch 'relax_gemmbatched_ld_requirements'
|
2017-07-23 21:04:17 +02:00 |
|
mcian
|
473e814718
|
Code refactoring
|
2017-07-23 14:48:13 +02:00 |
|
Cedric Nugteren
|
2d52f9b1d3
|
Merge pull request #176 from CNugteren/inline_keyword_optional
Made the inline keyword in kernels optional
|
2017-07-22 10:44:08 +02:00 |
|
Cedric Nugteren
|
1b4959a16a
|
Merge pull request #175 from mcian/Arm_Threshold
Add new threshold for ARM
|
2017-07-19 19:27:32 +02:00 |
|
mcian
|
a36283aaec
|
Add new threshold for ARM
|
2017-07-17 12:20:46 +02:00 |
|
mcian
|
8131e68664
|
Add PSO parameters support and search strategy selection from command line
|
2017-07-17 12:00:25 +02:00 |
|
Cedric Nugteren
|
97bcf77d4b
|
First step towards supporting im2col in the test infrastructure
|
2017-07-16 22:33:49 +02:00 |
|
Cedric Nugteren
|
de9ed9d4ea
|
Fixed batched tests when testing for invalid sizes against clBLAS
|
2017-07-12 21:54:16 +02:00 |
|
Cedric Nugteren
|
f77b48692b
|
Relaxed requirement on a_ld and b_ld for batched GEMM
|
2017-07-12 21:53:39 +02:00 |
|
Cedric Nugteren
|
f2477f6636
|
Removed spurious warning for Clang < 3.9
|
2017-07-12 20:58:31 +02:00 |
|
Cedric Nugteren
|
c71362b13d
|
Merge pull request #172 from CNugteren/msvc_improvements
Windows & MSVC improvements
|
2017-07-09 21:14:52 +02:00 |
|
Cedric Nugteren
|
d4c8a7c8b0
|
Changed printf-statements with %zu into std::cout to fix MSVC 2013 compatibility
|
2017-07-09 20:19:08 +02:00 |
|
Cedric Nugteren
|
4b415bdf3c
|
Disabled UNIX-style terminal color printing under Windows
|
2017-07-09 20:04:13 +02:00 |
|
Cedric Nugteren
|
442c31dd50
|
Made the inline keyword in kernels optional, currently only enabled for NVIDIA and ARM GPUs
|
2017-07-08 17:12:16 +02:00 |
|
Cedric Nugteren
|
84ec50e29d
|
Added interface and stubs for the im2col routine
|
2017-07-02 12:10:22 +02:00 |
|
Cedric Nugteren
|
75c0e861b8
|
Merge branch 'gemm_direct_bug'
|
2017-07-01 14:44:29 +02:00 |
|
Cedric Nugteren
|
4cf516cfec
|
Fixed an if-statement in the direct GEMM kernel causing a bug with specific sets of input parameters
|
2017-06-30 21:57:41 +02:00 |
|
Cedric Nugteren
|
52881f3864
|
Added batched GEMM example program
|
2017-06-29 21:15:25 +02:00 |
|
Cedric Nugteren
|
4e51b1e1f8
|
Moved and inlined some static member variables and disabled spurious clang warnings
|
2017-06-27 21:05:16 +02:00 |
|
Cedric Nugteren
|
e60b10529a
|
Undo of earlier move of TestBlas::kTransposes constant to fix MSVC 2013 compilation
|
2017-06-27 20:59:28 +02:00 |
|
Cedric Nugteren
|
ce528a9d39
|
Fixed and suppressed several warnings for MSVC
|
2017-06-26 21:38:04 +02:00 |
|
Cedric Nugteren
|
a823edb65f
|
Reduced optimization level for the (non-performance-critical) host code to speed up compilation
|
2017-06-26 21:36:56 +02:00 |
|
Cedric Nugteren
|
19504ed609
|
Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings
|
2017-06-25 20:59:22 +02:00 |
|
Cedric Nugteren
|
b8df03e5bc
|
Added CLBlast paper and presentation references in README
|
2017-06-25 20:45:14 +02:00 |
|
Cedric Nugteren
|
1a8ed48a35
|
Fixed some Clang and MSVC warnings
|
2017-06-25 11:50:36 +02:00 |
|
Cedric Nugteren
|
7eab65b699
|
Merge branch 'database_compilation_speed'
|
2017-06-25 10:16:52 +02:00 |
|
Cedric Nugteren
|
615a7fdc81
|
Fixes some compilation issues related to the database structure change
|
2017-06-21 23:07:47 +02:00 |
|
Cedric Nugteren
|
e44feb8576
|
Changed the structure of the database to reduce compilation time and save memory
|
2017-06-20 21:19:26 +02:00 |
|
Cedric Nugteren
|
48f2682eb7
|
Added tuning results for the Core i7-920 CPU
|
2017-06-18 20:53:59 +02:00 |
|
Cedric Nugteren
|
3070b502b5
|
Fixed an overflow bug on 32-bit systems when choosing a GEMM kernel
|
2017-06-18 20:51:11 +02:00 |
|
Cedric Nugteren
|
33ed1e5a06
|
Added tuning results for GeForce GT 650M (thanks to bzcheeseman)
|
2017-06-01 22:52:08 +02:00 |
|
Cedric Nugteren
|
f57e209aab
|
Merge pull request #158 from CNugteren/msvc_compilation_fixes
MSVC compilation fixes
|
2017-05-27 17:53:30 +02:00 |
|
Cedric Nugteren
|
4e04008729
|
Update to AppVeyor because of changed Khronos repository (9)
|
2017-05-27 17:39:36 +02:00 |
|
Cedric Nugteren
|
7827cfbe4a
|
Update to AppVeyor because of changed Khronos repository (8)
|
2017-05-27 17:33:47 +02:00 |
|
Cedric Nugteren
|
9ae6f174d9
|
Update to AppVeyor because of changed Khronos repository (7)
|
2017-05-27 17:30:30 +02:00 |
|
Cedric Nugteren
|
bb37bd0814
|
Update to AppVeyor because of changed Khronos repository (6)
|
2017-05-27 17:17:10 +02:00 |
|
Cedric Nugteren
|
53d739129e
|
Update to AppVeyor because of changed Khronos repository (5)
|
2017-05-27 17:11:22 +02:00 |
|
Cedric Nugteren
|
f7a822110c
|
Update to AppVeyor because of changed Khronos repository (4)
|
2017-05-27 17:06:09 +02:00 |
|
Cedric Nugteren
|
3bca9f85d2
|
Update to AppVeyor because of changed Khronos repository (3)
|
2017-05-27 17:01:11 +02:00 |
|
Cedric Nugteren
|
70188686f2
|
Merge pull request #157 from kpot/improved_caching
Fixes inability to run GEMM on multiple identical GPUs (issue #155)
|
2017-05-27 09:47:25 +02:00 |
|
Kirill Mavreshko
|
64ba590279
|
Fixed comment describing the order of program cache fields
|
2017-05-27 10:30:09 +05:00 |
|
Cedric Nugteren
|
01de4b5413
|
Update to AppVeyor because of changed Khronos repository (2)
|
2017-05-26 22:20:04 +02:00 |
|