Commit graph

986 commits

Author SHA1 Message Date
Cedric Nugteren b7473f50df Added status badges for correctness tests; updated list of contributors; fixed minor typos 2017-07-24 20:14:47 +02:00
Cedric Nugteren 55861c40ff Merge branch 'relax_gemmbatched_ld_requirements' 2017-07-23 21:04:17 +02:00
mcian 473e814718 Code refactoring 2017-07-23 14:48:13 +02:00
Cedric Nugteren 2d52f9b1d3 Merge pull request #176 from CNugteren/inline_keyword_optional: Made the inline keyword in kernels optional 2017-07-22 10:44:08 +02:00
Cedric Nugteren 1b4959a16a Merge pull request #175 from mcian/Arm_Threshold: Add new threshold for ARM 2017-07-19 19:27:32 +02:00
mcian a36283aaec Add new threshold for ARM 2017-07-17 12:20:46 +02:00
mcian 8131e68664 Add PSO parameters support and search strategy selection from command line 2017-07-17 12:00:25 +02:00
Cedric Nugteren 97bcf77d4b First step towards supporting im2col in the test infrastructure 2017-07-16 22:33:49 +02:00
Cedric Nugteren de9ed9d4ea Fixed batched tests when testing for invalid sizes against clBLAS 2017-07-12 21:54:16 +02:00
Cedric Nugteren f77b48692b Relaxed requirement on a_ld and b_ld for batched GEMM 2017-07-12 21:53:39 +02:00
Cedric Nugteren f2477f6636 Removed spurious warning for Clang < 3.9 2017-07-12 20:58:31 +02:00
Cedric Nugteren c71362b13d Merge pull request #172 from CNugteren/msvc_improvements: Windows & MSVC improvements 2017-07-09 21:14:52 +02:00
Cedric Nugteren d4c8a7c8b0 Changed printf-statements with %zu into std::cout to fix MSVC 2013 compatibility 2017-07-09 20:19:08 +02:00
Cedric Nugteren 4b415bdf3c Disabled UNIX-style terminal color printing under Windows 2017-07-09 20:04:13 +02:00
Cedric Nugteren 442c31dd50 Made the inline keyword in kernels optional; currently only enabled for NVIDIA and ARM GPUs 2017-07-08 17:12:16 +02:00
Cedric Nugteren 84ec50e29d Added interface and stubs for the im2col routine 2017-07-02 12:10:22 +02:00
Cedric Nugteren 75c0e861b8 Merge branch 'gemm_direct_bug' 2017-07-01 14:44:29 +02:00
Cedric Nugteren 4cf516cfec Fixed an if-statement in the direct GEMM kernel causing a bug with specific sets of input parameters 2017-06-30 21:57:41 +02:00
Cedric Nugteren 52881f3864 Added batched GEMM example program 2017-06-29 21:15:25 +02:00
Cedric Nugteren 4e51b1e1f8 Moved and inlined some static member variables and disabled spurious clang warnings 2017-06-27 21:05:16 +02:00
Cedric Nugteren e60b10529a Undo of earlier move of TestBlas::kTransposes constant to fix MSVC 2013 compilation 2017-06-27 20:59:28 +02:00
Cedric Nugteren ce528a9d39 Fixed and suppressed several warnings for MSVC 2017-06-26 21:38:04 +02:00
Cedric Nugteren a823edb65f Reduced optimization level for the (non-performance critical) host-code to speed up compilation 2017-06-26 21:36:56 +02:00
Cedric Nugteren 19504ed609 Moved static variable declarations from .cpp to .hpp to resolve some Clang warnings 2017-06-25 20:59:22 +02:00
Cedric Nugteren b8df03e5bc Added CLBlast paper and presentation references in README 2017-06-25 20:45:14 +02:00
Cedric Nugteren 1a8ed48a35 Fixed some Clang and MSVC warnings 2017-06-25 11:50:36 +02:00
Cedric Nugteren 7eab65b699 Merge branch 'database_compilation_speed' 2017-06-25 10:16:52 +02:00
Cedric Nugteren 615a7fdc81 Fixes some compilation issues related to the database structure change 2017-06-21 23:07:47 +02:00
Cedric Nugteren e44feb8576 Changed the structure of the database to reduce compilation time and save memory 2017-06-20 21:19:26 +02:00
Cedric Nugteren 48f2682eb7 Added tuning results for the Core i7-920 CPU 2017-06-18 20:53:59 +02:00
Cedric Nugteren 3070b502b5 Fixed an overflow bug on 32-bit systems when choosing a GEMM kernel 2017-06-18 20:51:11 +02:00
Cedric Nugteren 33ed1e5a06 Added tuning results for GeForce GT 650M (thanks to bzcheeseman) 2017-06-01 22:52:08 +02:00
Cedric Nugteren f57e209aab Merge pull request #158 from CNugteren/msvc_compilation_fixes: MSVC compilation fixes 2017-05-27 17:53:30 +02:00
Cedric Nugteren 4e04008729 Update to AppVeyor because of changed Khronos repository (9) 2017-05-27 17:39:36 +02:00
Cedric Nugteren 7827cfbe4a Update to AppVeyor because of changed Khronos repository (8) 2017-05-27 17:33:47 +02:00
Cedric Nugteren 9ae6f174d9 Update to AppVeyor because of changed Khronos repository (7) 2017-05-27 17:30:30 +02:00
Cedric Nugteren bb37bd0814 Update to AppVeyor because of changed Khronos repository (6) 2017-05-27 17:17:10 +02:00
Cedric Nugteren 53d739129e Update to AppVeyor because of changed Khronos repository (5) 2017-05-27 17:11:22 +02:00
Cedric Nugteren f7a822110c Update to AppVeyor because of changed Khronos repository (4) 2017-05-27 17:06:09 +02:00
Cedric Nugteren 3bca9f85d2 Update to AppVeyor because of changed Khronos repository (3) 2017-05-27 17:01:11 +02:00
Cedric Nugteren 70188686f2 Merge pull request #157 from kpot/improved_caching: Fixes inability to run GEMM on multiple identical GPUs (issue #155) 2017-05-27 09:47:25 +02:00
Kirill Mavreshko 64ba590279 Fixed comment describing the order of program cache fields 2017-05-27 10:30:09 +05:00
Cedric Nugteren 01de4b5413 Update to AppVeyor because of changed Khronos repository (2) 2017-05-26 22:20:04 +02:00
Cedric Nugteren e8b6f01e04 Update to AppVeyor because of changed Khronos repository 2017-05-26 22:12:02 +02:00
Cedric Nugteren f7a16d427c Fixed a compilation issue under MSVC 2013 2017-05-26 22:10:56 +02:00
Kirill Mavreshko 628e1e8cce Fixes inability to run GEMM on multiple identical GPUs (issue #155) 2017-05-26 15:04:19 +05:00
Cedric Nugteren 9c703a6021 Merge pull request #156 from ctuning/master: changing "wb" to "w" when saving json file (text mode) 2017-05-24 20:18:41 +02:00
Grigori Fursin 35e2e6c3a4 changing "wb" to "w" when saving json file (text mode) - compatibility for Python 3 2017-05-24 15:08:34 +02:00
Cedric Nugteren 953a5a9c22 Fixed a minor compilation issue of a sample with GCC 4.8 2017-05-15 22:14:17 +02:00
Cedric Nugteren 8400ee3a09 Fixed a TRSM issue caused by incorrect block size calculation 2017-05-15 22:04:55 +02:00