
675 commits

Author SHA1 Message Date
Cedric Nugteren 115af8c78e Updated AppVeyor script to fix an issue with changes in the latest AppVeyor servers 2016-09-25 10:44:31 +02:00
Cedric Nugteren 8a5ce05022 Fix another issue with the packaging in the AppVeyor script 2016-09-25 10:32:12 +02:00
Cedric Nugteren 08abb7dfa4 Fix an issue with the packaging in the AppVeyor script 2016-09-25 10:20:47 +02:00
Cedric Nugteren a594067758 Updated AppVeyor script to fix an issue with changes in the latest AppVeyor servers 2016-09-25 10:10:42 +02:00
Cedric Nugteren c712fd4cb1 Merge pull request #101 from dividiti/add_ref_includes_to_test_correctness_common
Add path to ref library header when building tests.
2016-09-24 15:26:08 +02:00
Anton Lokhmotov 750f185ba9 Add path to ref library header when building tests. 2016-09-24 11:46:34 +00:00
Cedric Nugteren d595a8ed7e Fixed a bug waiting for an invalid event in case of a non-successful CLBlast call in the tests and samples 2016-09-22 20:47:22 +02:00
Cedric Nugteren 6aa652d6ea Merge branch 'development' into gemm_direct 2016-09-21 21:32:18 +02:00
Cedric Nugteren b1929d8ce7 It is now possible to set the OpenCL compiler options through an environmental variable 2016-09-21 21:22:16 +02:00
Cedric Nugteren 63003a1429 Merge branch 'master' into development 2016-09-21 20:57:23 +02:00
Cedric Nugteren d13a98272b Merge pull request #100 from gpu/master
Fixed link in README.md
2016-09-20 21:47:15 +02:00
Marco Hutter 9b0f6238b3 Fixed link in README.md
The GitHub link could be https://github.com/gpu
(without "s"), but the website should be OK, too
2016-09-20 18:03:57 +02:00
Cedric Nugteren f07ac22f5b Merge pull request #99 from CNugteren/development
Update to version 0.9.0
2016-09-13 21:14:51 +02:00
Cedric Nugteren 4b94afda94 Updated to version 0.9.0 2016-09-13 19:20:39 +02:00
Cedric Nugteren 48ab0428cb Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM 2016-09-13 19:08:49 +02:00
Cedric Nugteren d7305346ca Merge pull request #98 from intelfx/no-ignored-attributes
CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings
2016-09-13 17:58:12 +02:00
Ivan Shapovalov 9095537a6a CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings 2016-09-13 16:12:30 +03:00
Cedric Nugteren 4ce584a014 Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings 2016-09-12 22:13:16 +02:00
Cedric Nugteren 9fb7a0efe1 Merge branch 'database_rewrite' into development 2016-09-12 20:16:18 +02:00
Cedric Nugteren aa3dffe356 Added XgemvFastRot and Xgemm 16-bit tuning results: just defaults which are now automatically taken from 32-bit if there are no entries at all 2016-09-12 20:13:38 +02:00
Cedric Nugteren b5a67f86ec Complete re-write of the database script. Changed Pandas for the much faster and more convenient plain JSON/dict data-type 2016-09-11 21:29:28 +02:00
Cedric Nugteren 94163970ae Merge branch 'xgemm_tuner_exhaustive' into development 2016-09-10 14:01:21 +02:00
Cedric Nugteren e21f32bc99 Updated database based on exhaustive tuning results for GEMM for the R9 M370X GPU 2016-09-10 14:00:43 +02:00
Cedric Nugteren 3daba70997 Updated the database script to remove duplicate entries: keeps only the best-performing cases for a specific parameters combination 2016-09-10 11:12:09 +02:00
Cedric Nugteren 55038d3c91 Split GEMM tuning in two parts: a small set of tuning parameters which is explored exhaustively and a larger set which is explored randomly 2016-09-06 20:30:06 +02:00
Cedric Nugteren a2f8350703 Refactored the Python C++ generator script; now conforms to the PEP8 style guide 2016-09-04 21:26:30 +02:00
Cedric Nugteren b30b26b89e The GEMM kernel no longer adds beta*C in case beta is zero; this would cause problems if C contains NaNs 2016-09-04 17:21:16 +02:00
Cedric Nugteren 521bf6cdfc Added tuning results for Intel Broadwell 5500 GT2 GPU 2016-09-03 16:43:23 +02:00
Cedric Nugteren 19574b2519 Updated tuning results for Haswell GT2 Mobile GPU; fixed database script to handle duplicate entries of different runs 2016-09-03 12:45:11 +02:00
Cedric Nugteren 478fb089d5 Merge pull request #93 from intelfx/test-read-environment
test/correctness: read platform and device from environment
2016-08-27 10:16:34 +02:00
Ivan Shapovalov ea43936e94 test/correctness: read platform and device from environment
Support passing environment variables CLBLAST_PLATFORM and CLBLAST_DEVICE
instead of -platform and -device arguments to test executables.

This is for `ctest`.
2016-08-27 05:37:26 +03:00
Cedric Nugteren 8d6a6a5bbf Merge branch 'database_defaults' into development 2016-08-22 19:31:36 +02:00
Cedric Nugteren 0c0f0ac7f9 Also changed the default-default for unknown device types to use the same method as for known device groups 2016-08-21 20:35:20 +02:00
Cedric Nugteren 84db8958d1 Increased the ratio of GEMM tuning results to explore; reduced the tuning search space to have a better chance to evaluate more likely parameter combinations 2016-08-21 20:28:02 +02:00
Cedric Nugteren 00979faab4 Updated the changelog; refactored the database-get-bests code a bit 2016-08-21 20:16:06 +02:00
Cedric Nugteren 7eeef74338 Merge branch 'development' of github.com:CNugteren/CLBlast into development
Conflicts:
	README.md
2016-08-20 12:59:21 +02:00
Cedric Nugteren ce9ba27450 Merge branch 'dvasschemacq-master' into development 2016-08-20 12:51:16 +02:00
Cedric Nugteren 6eca53ee23 Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master
Conflicts:
	src/kernels/level1/xaxpy.opencl
	src/kernels/level2/xgemv.opencl
	src/kernels/level2/xgemv_fast.opencl
	src/kernels/level2/xger.opencl
	src/kernels/level2/xher.opencl
	src/kernels/level2/xher2.opencl
	src/kernels/level3/xgemm_part2.opencl
2016-08-20 12:50:31 +02:00
D. Van Assche 57f1aa7685 Adapt opencl files for 1.1 OpenCL
In OpenCL 1.1 __kernel has to be before __attribute__, at least with
Vivante compiler.
2016-08-18 17:33:13 +02:00
Cedric Nugteren 7d5631b7e4 Updated the database script to calculate the relative best performance of tuning results common for a device/vendor type 2016-08-15 21:01:07 +02:00
Cedric Nugteren 7da6492b36 Improved the speed of the new common-best defaults method for the database generation 2016-08-09 21:06:04 +02:00
Cedric Nugteren 3f5401d4c8 Added a first version of the database's common-best default calculation 2016-08-07 16:25:38 +02:00
Cedric Nugteren 35623cd98d Minor update regarding the previous CMake export/install target changes 2016-07-28 20:45:09 +02:00
Cedric Nugteren c3712f5b36 Merge pull request #86 from intelfx/cmake
CMakeLists.txt: provide a find_package() config for dependent projects
2016-07-28 20:17:13 +02:00
Ivan Shapovalov 227374deba .appveyor.yml: move {OPENCL,CLBLAST}_ROOT out of source tree
Reasoning is the same as in previous commit: CMake does not like having
OpenCL header path inside of the source tree. CLBLAST_ROOT is moved for
uniformity.
2016-07-28 19:09:30 +03:00
Ivan Shapovalov 6c11fdc12c .travis.yml: use OpenCL ICD Loader and headers shipped by distro
Using our own headers causes problems with CMake which does not like having
OpenCL header path inside of the source tree. While at it, use distro's
universal OpenCL loader as well.
2016-07-28 19:09:29 +03:00
Ivan Shapovalov b5d7b58393 CMakeLists.txt: use target_include_directories() 2016-07-28 19:09:29 +03:00
Ivan Shapovalov 570cbcffa7 CMakeLists.txt: provide a find_package() config for dependent projects 2016-07-28 19:09:29 +03:00
Cedric Nugteren 5004a435ff Fixed issues related to the recent changes in the Xgemm infrastructure 2016-07-26 20:59:59 +02:00
Cedric Nugteren 5053f6ebc6 Merge branch 'development' into gemm_direct 2016-07-26 20:53:31 +02:00