Commit graph

126 commits

Author SHA1 Message Date
Cedric Nugteren e81eb4f6d4 Added a note that the ArrayFire Jenkins servers are down, being switched to buildbot 2017-12-24 11:32:31 +01:00
Cedric Nugteren 0ee81e27b9 Added tuning results for Apple AMD Radeon Pro 580 2017-12-20 19:59:31 +01:00
Cedric Nugteren 35e2b3ed5b Updated the known issues 2017-12-16 12:11:15 +01:00
Cedric Nugteren abb4d5ab32 Added tuning results for ARM Mali T760 GPU 2017-11-24 21:16:54 +01:00
Cedric Nugteren d9cf206979 Removed dependency on CLTune 2017-11-16 21:28:36 +01:00
Cedric Nugteren c41d219ea4 Added tuning results for the GeForce GTX750Ti 2017-11-09 21:19:21 +01:00
Cedric Nugteren 5d5e3f93bc Updated to CLBlast version 1.2.0 2017-11-08 21:30:06 +01:00
Cedric Nugteren d24138808b Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer returning functions for FP16 2017-11-08 21:20:07 +01:00
Cedric Nugteren b18cc9d3f1
Merge pull request #212 from CNugteren/kernel_selection_tuner
GEMM kernel selection tuner
2017-11-07 22:20:13 +01:00
Cedric Nugteren 9b0a435fb0 Integrated the GEMM routine tuner for kernel selection; added first tuning results 2017-11-02 21:47:14 +01:00
Cedric Nugteren f24d611e57 Made it possible to compile the CLBlast performance clients for Android with the NDK 2017-10-29 13:02:14 +01:00
Cedric Nugteren 319762f150 Added Android support using the GNU C++ STL library and the GCC toolchain 2017-10-29 12:07:07 +01:00
Cedric Nugteren 12b08ae491 Merge branch 'master' into android_support 2017-10-28 17:32:37 +02:00
Cedric Nugteren 5fd1f2fc60 Added first version of a roadmap 2017-10-20 18:21:31 +02:00
Cedric Nugteren 472f90501c Added tuning parameters for GeForce GTX 580, GeForce GTX 1080Ti, and Core i5-4570 2017-10-20 18:06:12 +02:00
Cedric Nugteren 03760f80eb Added CUDA API documentation 2017-10-16 21:54:42 +02:00
Cedric Nugteren f4c4674cf6 Updated to version 1.1.0 2017-09-30 17:19:17 +02:00
Cedric Nugteren 2949e156f5 Added notes for Android compilation of CLBlast 2017-09-26 21:23:53 +02:00
Cedric Nugteren a23cd8d13a Updated README with proper AMD device names; fixed device look-up for names of length 50+ 2017-09-16 21:26:38 +02:00
Cedric Nugteren 4d9d03ba51 Completed im2col implementation 2017-08-24 21:11:12 +02:00
Cedric Nugteren 18d832e149 Added tuning results for the Qualcomm Adreno 330 GPU 2017-07-30 18:18:02 +02:00
Cedric Nugteren b7473f50df Added status badges for correctness tests; updated list of contributors; fixed minor typos 2017-07-24 20:14:47 +02:00
Cedric Nugteren b8df03e5bc Added CLBlast paper and presentation references in README 2017-06-25 20:45:14 +02:00
Cedric Nugteren 48f2682eb7 Added tuning results for the Core i7-920 CPU 2017-06-18 20:53:59 +02:00
Cedric Nugteren 33ed1e5a06 Added tuning results for GeForce GT 650M (thanks to bzcheeseman) 2017-06-01 22:52:08 +02:00
Cedric Nugteren f151e56daa Added the IxAMIN routines: absolute minimum version of IxAMAX 2017-05-12 20:01:33 -07:00
Cedric Nugteren 81d9ed3946 Removed the included performance reports; README now redirects to the new external website 2017-05-12 13:18:10 -07:00
Cedric Nugteren 71933c3411 Added tuning results for the AMD Radeon Fiji GPU 2017-05-11 22:53:52 -07:00
Cedric Nugteren d67455fdb8 Fixes the build-status table in the README 2017-05-11 22:22:10 -07:00
Cedric Nugteren b0f3659121 The master branch is now the main 'development' branch 2017-05-03 19:49:15 +02:00
Cedric Nugteren e3bb58f602 Finalized support for performance testing against cuBLAS 2017-04-16 17:53:51 +02:00
Cedric Nugteren fa5c4b00b7 Replaced the R graph scripts with Python/Matplotlib benchmark scripts 2017-03-26 15:36:34 +02:00
Cedric Nugteren 7b8f8fce68 Added initial naive version of the batched GEMM routine based on the direct GEMM kernel 2017-03-11 16:02:45 +01:00
Cedric Nugteren e9ef037549 Added tuning results for the Radeon HD6750M GPU (Apple OpenCL) 2017-03-04 15:24:55 +01:00
Cedric Nugteren 4284fcd940 Updated the README documentation 2017-02-26 16:32:53 +01:00
Cedric Nugteren ea6790665d Merge branch 'development' into triangular_solvers 2017-02-26 14:51:45 +01:00
Cedric Nugteren b7310036ed Removed half-precision support from the TRSM routine; too unstable 2017-02-26 12:56:21 +01:00
Cedric Nugteren ccac957f17 Added documentation for the TRSV and TRSM routines 2017-02-25 13:02:15 +01:00
Cedric Nugteren 0643a29af5 Added tuning parameters for the AMD RX480 GPU (Ellesmere) 2017-02-18 13:59:10 +01:00
Cedric Nugteren 2e0951c6dc Fixed small typo in the documentation 2017-02-18 11:05:54 +01:00
Cedric Nugteren fef11a208c Added documentation for the OverrideParameters function 2017-02-18 11:02:57 +01:00
Cedric Nugteren dc93523204 Added tuning results for Titan X (Pascal version) 2017-02-08 21:14:38 +01:00
Cedric Nugteren 2e4f6e1609 Added tuning results for NVIDIA GTX 1080 and Intel Core i7-4790K 2017-01-19 19:42:31 +01:00
Cedric Nugteren 32b850b12b Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU 2017-01-03 20:30:56 +01:00
Cedric Nugteren 2cf7d8429a Updated to version 0.10.0 2016-11-27 13:34:18 +01:00
Cedric Nugteren 39c49bf4f9 Made it possible to use the command-line environmental variables for each executable and without re-running CMake 2016-11-27 11:00:29 +01:00
Cedric Nugteren fa42befcc1 Made compilation of the Netlib CBLAS API conditional 2016-11-23 21:33:35 +01:00
Cedric Nugteren bb14a5880e Added an example and documentation for the Netlib CBLAS API 2016-10-25 20:37:33 +02:00
Cedric Nugteren 0f5bf35ebe Updated list of acknowledgments and thanks 2016-10-24 19:54:45 +02:00
Cedric Nugteren ec687afa75 Added tuning results for GeForce GTX TITAN Black 2016-10-24 19:49:10 +02:00