Commit graph

126 commits

Author SHA1 Message Date
Cedric Nugteren 43f4f02399 Added an initial version of contributing guidelines 2016-10-23 16:56:51 +02:00
Cedric Nugteren c925fe463f Added tuning results for the AMD Tonga GPU 2016-10-22 16:25:31 +02:00
Cedric Nugteren c8d0e41e84 Added the possibility to supply the env-variable CLBLAST_TEST_ARGUMENTS to specify options for the make alltest or ctest targets 2016-10-20 23:05:16 +02:00
Cedric Nugteren 53deed298f Added documentation and minor refactoring for the recent support of static library compilation 2016-10-15 17:11:08 +02:00
Cedric Nugteren ebb505b783 Added tuning results for Intel HD Graphics IvyBridge GPU 2016-10-13 12:18:28 +02:00
Cedric Nugteren 8a9d3cdf37 Added support for compiling the library, the client, and the samples under MSVC 2013 2016-10-10 22:45:39 +02:00
Cedric Nugteren d59e5c570b Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0 2016-09-27 21:03:24 +02:00
Cedric Nugteren b1929d8ce7 It is now possible to set the OpenCL compiler options through an environmental variable 2016-09-21 21:22:16 +02:00
Marco Hutter 9b0f6238b3 Fixed link in README.md
The GitHub link could be https://github.com/gpu
(without "s"), but the website should be OK, too
2016-09-20 18:03:57 +02:00
Cedric Nugteren 4b94afda94 Updated to version 0.9.0 2016-09-13 19:20:39 +02:00
Cedric Nugteren 48ab0428cb Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM 2016-09-13 19:08:49 +02:00
Cedric Nugteren 521bf6cdfc Added tuning results for Intel Broadwell 5500 GT2 GPU 2016-09-03 16:43:23 +02:00
Cedric Nugteren 7eeef74338 Merge branch 'development' of github.com:CNugteren/CLBlast into development
Conflicts:
	README.md
2016-08-20 12:59:21 +02:00
Cedric Nugteren 6eca53ee23 Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master
Conflicts:
	src/kernels/level1/xaxpy.opencl
	src/kernels/level2/xgemv.opencl
	src/kernels/level2/xgemv_fast.opencl
	src/kernels/level2/xger.opencl
	src/kernels/level2/xher.opencl
	src/kernels/level2/xher2.opencl
	src/kernels/level3/xgemm_part2.opencl
2016-08-20 12:50:31 +02:00
Cedric Nugteren 35623cd98d Minor update regarding the previous CMake export/install target changes 2016-07-28 20:45:09 +02:00
Cedric Nugteren 57f09178d8 Added tuning results for AMD Oland and for Intel Graphics HD 530 2016-07-10 11:46:44 +02:00
Cedric Nugteren 27854070b4 Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen 2016-07-06 21:50:12 +02:00
Cedric Nugteren 9683b50c55 Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp) 2016-07-03 20:30:47 +02:00
Cedric Nugteren 9171f1c160 Updated the README in various places 2016-06-27 17:28:48 +02:00
Cedric Nugteren 5557a6ae81 Added vcvarsall to AppVeyor and added AppVeyor icons to README 2016-06-27 14:10:56 +02:00
Cedric Nugteren 7eeb790824 Added Appveyor Windows CI support 2016-06-27 12:47:39 +02:00
Cedric Nugteren 5f8886339a Increased coverage of Travis CI automatic builds 2016-06-27 12:16:12 +02:00
Cedric Nugteren 69beca90f4 Moved the performance graph scripts to the 'scripts' subfolder 2016-06-27 11:51:57 +02:00
Cedric Nugteren 66908ef5cd Added tuning results for 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile' (thanks to OursDesCavernes) 2016-06-19 14:59:50 +02:00
Cedric Nugteren 61203453aa Renamed all C++ source files to .cpp to match the .hpp extension better 2016-06-19 13:55:49 +02:00
Cedric Nugteren 52ccaf5b25 Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing 2016-06-16 18:07:46 +02:00
Cedric Nugteren 6d6b030053 Made the CPU BLAS library the default reference to test against in favor of clBLAS 2016-06-08 09:21:39 +02:00
Cedric Nugteren 137d1d8708 Added tuning parameters for 'GRID K520' and 'HD Graphics Skylake ULT GT2' 2016-06-01 09:39:33 +02:00
Cedric Nugteren 305bf16c4c Separated the performance tests (clients) from the correctness tests in CMake 2016-05-30 16:38:26 +02:00
Cedric Nugteren 9f87455070 Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM 2016-05-25 13:29:53 +02:00
Cedric Nugteren ac1575056e Added proper argument handling and displaying for half-precision data-types 2016-05-24 14:06:16 +02:00
Cedric Nugteren ae7d705d6f Updated README with information on half-precision support 2016-05-23 19:23:46 +02:00
Cedric Nugteren 489c5d76cf Merged in latest changes from 0.7.1 release 2016-05-18 21:32:56 +02:00
Cedric Nugteren 1c72d225c5 Fixed links in the README 2016-05-10 21:03:51 +02:00
Cedric Nugteren c5730c8b43 Updated to version 0.7.0 2016-05-08 20:29:41 +02:00
Cedric Nugteren 6c9e08c5e2 Added an option to the tests to control whether to test against clBLAS or a CPU BLAS library 2016-05-07 12:22:06 +02:00
Cedric Nugteren 435729a43e Added tuning results for AMD Hawaii (R9 290X) 2016-05-02 20:20:23 +02:00
Cedric Nugteren e113ff0852 Added non-aboslute minimum counter-part IxMIN of the BLAS routine IxAMAX 2016-04-30 09:49:39 +02:00
Cedric Nugteren d7ddbdeb1f Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM and IxAMAX 2016-04-27 18:07:30 +02:00
cnugteren 16a048f1ac Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines 2016-04-20 22:12:51 -06:00
cnugteren 8be99de82d Added support for the SASUM/DASUM/ScASUM/DzASUM routines 2016-04-14 19:58:26 -06:00
cnugteren 1d3d38a261 Events are now properly implemented using event waiting list and asking the user to wait for event completion 2016-04-09 22:22:24 -06:00
cnugteren c4ab9bda63 Updated the documentation in light of the support for a reference CPU BLAS library 2016-04-03 16:07:25 -07:00
cnugteren 8217b01702 Updated the documentation 2016-03-31 20:20:32 -07:00
Cedric Nugteren de7e68e872 Updated the README file 2016-03-13 10:48:42 +01:00
Cedric Nugteren 306bf67660 Added preliminary support for xHPR2 and xSPR2 routines 2016-03-06 15:48:11 +01:00
Cedric Nugteren fa79720557 Added tuning results for Intel Iris Pro and AMD R9 M370X 2016-02-28 16:47:52 +01:00
Cedric Nugteren e3545215a5 Added support for xHER, xHPR, xSYR, and xSPR routines 2016-02-28 14:16:48 +01:00
Cedric Nugteren 6dc44da07b Added support for xGERU and xGERC routines 2016-02-20 14:15:41 +01:00
Cedric Nugteren 6f4b34f813 Added tuning parameters for various devices using the new database script 2016-02-07 16:41:09 +01:00
Cedric Nugteren 44fb40e5c4 Prepared for MSVC support 2016-01-30 11:54:29 +01:00
CNugteren 2b56c2c603 Added TRMV/TBMV/TPMV routines 2015-09-26 16:58:03 +02:00
CNugteren de6547a92b Added SBMV and SPMV routines 2015-09-19 18:01:19 +02:00
CNugteren 80da67d28b Added the HPMV routine 2015-09-19 17:40:38 +02:00
CNugteren aebd156869 Added the HBMV routine 2015-09-19 11:11:34 +02:00
CNugteren 4507ba4997 Added first version of banded matrix-vector multiplication 2015-09-18 15:25:20 +02:00
CNugteren 224c967584 Removed routines from the table which are not supported by clBLAS 2015-09-14 17:02:33 +02:00
CNugteren a2e726d3bd Added xDOT/xDOTU/xDOTC dot-product routines 2015-09-14 16:57:00 +02:00
CNugteren ff0c54c386 Added the XSWAP, XSCAL and XCOPY level-1 routines 2015-08-22 17:11:20 +02:00
CNugteren ff1a670e88 Updated the documentation 2015-08-22 12:40:18 +02:00
Cedric Nugteren 85bd783e0d Merge pull request #22 from CNugteren/travis
Added Travis continuous integration
2015-08-19 09:34:01 +02:00
CNugteren e806bc1ff0 Added Travis build-status to the README 2015-08-19 09:29:54 +02:00
CNugteren 4242f90215 Added the plain C API 2015-08-13 18:00:09 +02:00
CNugteren c52c5f3d35 Added HEMV and SYMV 2015-07-31 17:41:10 +02:00
CNugteren a27ce11c69 Updated documentation reflecting removal of clBLAS sources 2015-07-31 11:15:48 +02:00
CNugteren a76dc2f09c Updated the docs to reflect the performance improvements 2015-07-24 08:16:41 +02:00
CNugteren c920400261 Added HEMM, HERK, HER2K, and TRMM 2015-07-12 15:14:35 +02:00
CNugteren e27e339ebf Replaced crosses with tickmarks 2015-06-26 17:43:17 +02:00
CNugteren 7c8d16147a Added the SYR2K routine, tester, and client 2015-06-26 08:12:56 +02:00
CNugteren 3de4471afe Added the SYRK routine 2015-06-24 07:52:19 +02:00
CNugteren c365614eb5 More detailed test passed/skipped/failure reporting 2015-06-20 16:43:50 +02:00
CNugteren 8b2dbdba98 Updated with conjugate transpose and CGEMM/ZGEMM CSYMM/ZSYMM 2015-06-17 07:12:45 +02:00
CNugteren f925d47dad Added GEMV to changelog and readme 2015-06-15 08:41:37 +02:00
CNugteren bc5a341dfe Initial commit of preview version 2015-05-30 12:30:43 +02:00
CNugteren c7b054ea67 Initial commit of preview version 2015-05-30 11:32:02 +02:00
Cedric Nugteren 5c9f306ab5 Initial commit 2015-05-30 10:50:51 +02:00