Commit graph

991 commits

Author SHA1 Message Date
Cedric Nugteren 5467c0cac5 Fixed a variety of warnings and an error for MSVC2013 compilation 2017-11-19 21:09:24 +01:00
Cedric Nugteren da76d7ab81 Merge pull request #216 from CNugteren/integrated_tuner
Integrated tuner
2017-11-19 20:05:15 +01:00
Cedric Nugteren defad3d1a2 Minor fix to the database script 2017-11-19 18:19:21 +01:00
Cedric Nugteren 4e0d08c3bc Added compilation timing and better compilation error reporting 2017-11-19 16:58:13 +01:00
Cedric Nugteren a3a8b44f59 Some fixes for the new auto-tuner to be compatible with the Python scripts 2017-11-19 16:31:08 +01:00
Cedric Nugteren c6690df896 Made the tuners be compiled by default 2017-11-19 14:33:25 +01:00
Cedric Nugteren 76d2b7f0b6 Revived the GEMM routine tuner; minor formatting changes 2017-11-19 12:59:52 +01:00
Cedric Nugteren 8d2f7d53aa Added a library with common tuner sources to speed-up compilation 2017-11-19 12:59:28 +01:00
Cedric Nugteren 7a54494577 Modified the kernel tuners to use the newly integrated auto-tuner 2017-11-19 12:58:41 +01:00
Cedric Nugteren 8a5a5e031e Moved some tuning functions from .hpp to .cpp 2017-11-17 20:58:36 +01:00
Cedric Nugteren f94d498a37 Moved compilation function to separate file; removed dependency of tuners of the CLBlast library 2017-11-17 20:57:46 +01:00
Cedric Nugteren d9cf206979 Removed dependency on CLTune 2017-11-16 21:28:36 +01:00
Cedric Nugteren 2b8ad70b63 Added printing of the best parameters for the new tuner 2017-11-16 21:18:29 +01:00
Cedric Nugteren 1b2b46f2f0 Added first version of integrated and re-written auto-tuner 2017-11-15 22:49:35 +01:00
Cedric Nugteren 0cd78bb6f9 Added kernel timing functionality to the utilities 2017-11-15 22:47:06 +01:00
Cedric Nugteren b337bffbaf Added exception handle with catch-all 2017-11-15 22:44:44 +01:00
Cedric Nugteren 03ebf14b97 Made the exception dispatch function optionally silent 2017-11-13 21:11:31 +01:00
Cedric Nugteren 4bac1287f2 Moved square-difference utility function for use in the tuners 2017-11-13 21:10:44 +01:00
Cedric Nugteren 677afd3b96 Factored out the creation of the OpenCL header and the program compilation 2017-11-11 16:14:43 +01:00
Cedric Nugteren c41d219ea4 Added tuning results for the GeForce GTX750Ti 2017-11-09 21:19:21 +01:00
Cedric Nugteren 5d5e3f93bc Updated to CLBlast version 1.2.0 2017-11-08 21:30:06 +01:00
Cedric Nugteren d24138808b Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer returning functions for FP16 2017-11-08 21:20:07 +01:00
Cedric Nugteren b18cc9d3f1 Merge pull request #212 from CNugteren/kernel_selection_tuner
GEMM kernel selection tuner
2017-11-07 22:20:13 +01:00
Cedric Nugteren 6fe9916231 Updated the roadmap 2017-11-07 21:35:04 +01:00
Cedric Nugteren 3ec0be6fb8 Added various GEMM routine tuning results 2017-11-07 21:34:54 +01:00
Cedric Nugteren 33ac2b0175 Improved the way the database defaults are computed 2017-11-06 21:59:45 +01:00
Cedric Nugteren 34a33b54cf Changed GEMM routine tuner's scoring to use L2 measure instead for better averaging 2017-11-06 20:50:36 +01:00
Cedric Nugteren 9b0a435fb0 Integrated the GEMM routine tuner for kernel selection; added first tuning results 2017-11-02 21:47:14 +01:00
Cedric Nugteren 73272ab97d Fixed a bug in database compression/decompression 2017-11-02 21:19:18 +01:00
Cedric Nugteren 5c90577dfd Added collecting and printing of scores for the kernel-selection tuner 2017-10-30 20:39:21 +01:00
Cedric Nugteren 061b1c571b Merge branch 'binary_cache_platform_dependent' 2017-10-30 19:42:35 +01:00
Cedric Nugteren ac5a58cfe5 Added platform ID to the binary program cache to prevent issues with multi-platform systems 2017-10-29 20:01:30 +01:00
Cedric Nugteren 19c53f6dd0 Merge pull request #208 from CNugteren/android_support
Added Android support
2017-10-29 16:45:56 +01:00
Cedric Nugteren f24d611e57 Made it possible to compile the CLBlast performance clients for Android with the NDK 2017-10-29 13:02:14 +01:00
Cedric Nugteren 319762f150 Added Android support using the GNU C++ STL library and the GCC toolchain 2017-10-29 12:07:07 +01:00
Cedric Nugteren 12b08ae491 Merge branch 'master' into android_support 2017-10-28 17:32:37 +02:00
Cedric Nugteren 334a26eb12 Added initial version of a GEMM kernel selection tuner 2017-10-28 17:30:29 +02:00
Cedric Nugteren bd57dfa435 Moved timing function to a separate file 2017-10-28 14:12:05 +02:00
Cedric Nugteren fa6e5e67f5 Fixed a bug when using the matrix A-offset argument for the TRSM routine 2017-10-27 22:12:30 +02:00
Cedric Nugteren 449577cf07 Reduced TRSM block-size for better numerical stability 2017-10-27 22:07:43 +02:00
Cedric Nugteren 44f7fa628a Added GEMV synchronisation for the TRSV routine: similar bug as in TRSM 2017-10-27 22:01:15 +02:00
Cedric Nugteren 8579b2b494 Added a DTRSM C++ interface example 2017-10-27 21:53:19 +02:00
Cedric Nugteren e388f055f7 Fixed small bug in (unused) invert tester 2017-10-25 20:35:39 +02:00
Cedric Nugteren 8cdb5cb4a7 Updated roadmap with links to issues and status 2017-10-25 20:35:39 +02:00
Cedric Nugteren d49aae236e Fixed a bug in TRSM routine due to missing event synchronisations after GEMM calls 2017-10-25 20:35:39 +02:00
Cedric Nugteren 42ac3b4748 Merge pull request #206 from matze/use-gnuinstall-dirs
Use GNUInstallDirs to determine install paths
2017-10-23 20:03:47 +02:00
Matthias Vogelgesang 34e537a5c1 Use GNUInstallDirs to determine install paths
The GNUInstallDirs module* provides variables that match the install directories
for GNU Software and allows users to override them. Without hardcoding paths
packagers can choose library paths according to distribution policies (e.g.
lib, lib64, lib<arch>, ...).

* https://cmake.org/cmake/help/v3.0/module/GNUInstallDirs.html
2017-10-23 15:54:55 +02:00
Cedric Nugteren 5fd1f2fc60 Added first version of a roadmap 2017-10-20 18:21:31 +02:00
Cedric Nugteren 472f90501c Added tuning parameters for GeForce GTX 580, GeForce GTX 1080Ti, and Core i5-4570 2017-10-20 18:06:12 +02:00
Cedric Nugteren 42dcd8fd8a Merge pull request #204 from CNugteren/cuda_api
Cuda API to CLBlast
2017-10-20 12:07:30 +02:00