Cedric Nugteren
|
4284fcd940
|
Updated the README documentation
|
2017-02-26 16:32:53 +01:00 |
|
Cedric Nugteren
|
ea6790665d
|
Merge branch 'development' into triangular_solvers
|
2017-02-26 14:51:45 +01:00 |
|
Cedric Nugteren
|
b7310036ed
|
Removed half-precision support from the TRSM routine; too unstable
|
2017-02-26 12:56:21 +01:00 |
|
Cedric Nugteren
|
ccac957f17
|
Added documentation for the TRSV and TRSM routines
|
2017-02-25 13:02:15 +01:00 |
|
Cedric Nugteren
|
0643a29af5
|
Added tuning parameters for the AMD RX480 GPU (Ellesmere)
|
2017-02-18 13:59:10 +01:00 |
|
Cedric Nugteren
|
2e0951c6dc
|
Fixed small typo in the documentation
|
2017-02-18 11:05:54 +01:00 |
|
Cedric Nugteren
|
fef11a208c
|
Added documentation for the OverrideParameters function
|
2017-02-18 11:02:57 +01:00 |
|
Cedric Nugteren
|
dc93523204
|
Added tuning results for Titan X (Pascal version)
|
2017-02-08 21:14:38 +01:00 |
|
Cedric Nugteren
|
2e4f6e1609
|
Added tuning results for NVIDIA GTX 1080 and Intel Core i7-4790K
|
2017-01-19 19:42:31 +01:00 |
|
Cedric Nugteren
|
32b850b12b
|
Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU
|
2017-01-03 20:30:56 +01:00 |
|
Cedric Nugteren
|
2cf7d8429a
|
Updated to version 0.10.0
|
2016-11-27 13:34:18 +01:00 |
|
Cedric Nugteren
|
39c49bf4f9
|
Made it possible to use the command-line environment variables for each executable and without re-running CMake
|
2016-11-27 11:00:29 +01:00 |
|
Cedric Nugteren
|
fa42befcc1
|
Made compilation of the Netlib CBLAS API conditional
|
2016-11-23 21:33:35 +01:00 |
|
Cedric Nugteren
|
bb14a5880e
|
Added an example and documentation for the Netlib CBLAS API
|
2016-10-25 20:37:33 +02:00 |
|
Cedric Nugteren
|
0f5bf35ebe
|
Updated list of acknowledgments and thanks
|
2016-10-24 19:54:45 +02:00 |
|
Cedric Nugteren
|
ec687afa75
|
Added tuning results for GeForce GTX TITAN Black
|
2016-10-24 19:49:10 +02:00 |
|
Cedric Nugteren
|
43f4f02399
|
Added an initial version of contributing guidelines
|
2016-10-23 16:56:51 +02:00 |
|
Cedric Nugteren
|
c925fe463f
|
Added tuning results for the AMD Tonga GPU
|
2016-10-22 16:25:31 +02:00 |
|
Cedric Nugteren
|
c8d0e41e84
|
Added the possibility to supply the env-variable CLBLAST_TEST_ARGUMENTS to specify options for the make alltest or ctest targets
|
2016-10-20 23:05:16 +02:00 |
|
Cedric Nugteren
|
53deed298f
|
Added documentation and minor refactoring for the recent support of static library compilation
|
2016-10-15 17:11:08 +02:00 |
|
Cedric Nugteren
|
ebb505b783
|
Added tuning results for Intel HD Graphics IvyBridge GPU
|
2016-10-13 12:18:28 +02:00 |
|
Cedric Nugteren
|
8a9d3cdf37
|
Added support for compiling the library, the client, and the samples under MSVC 2013
|
2016-10-10 22:45:39 +02:00 |
|
Cedric Nugteren
|
d59e5c570b
|
Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0
|
2016-09-27 21:03:24 +02:00 |
|
Cedric Nugteren
|
b1929d8ce7
|
It is now possible to set the OpenCL compiler options through an environment variable
|
2016-09-21 21:22:16 +02:00 |
|
Marco Hutter
|
9b0f6238b3
|
Fixed link in README.md
The GitHub link could be https://github.com/gpu
(without "s"), but the website should be OK, too
|
2016-09-20 18:03:57 +02:00 |
|
Cedric Nugteren
|
4b94afda94
|
Updated to version 0.9.0
|
2016-09-13 19:20:39 +02:00 |
|
Cedric Nugteren
|
48ab0428cb
|
Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM
|
2016-09-13 19:08:49 +02:00 |
|
Cedric Nugteren
|
521bf6cdfc
|
Added tuning results for Intel Broadwell 5500 GT2 GPU
|
2016-09-03 16:43:23 +02:00 |
|
Cedric Nugteren
|
7eeef74338
|
Merge branch 'development' of github.com:CNugteren/CLBlast into development
Conflicts:
README.md
|
2016-08-20 12:59:21 +02:00 |
|
Cedric Nugteren
|
6eca53ee23
|
Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master
Conflicts:
src/kernels/level1/xaxpy.opencl
src/kernels/level2/xgemv.opencl
src/kernels/level2/xgemv_fast.opencl
src/kernels/level2/xger.opencl
src/kernels/level2/xher.opencl
src/kernels/level2/xher2.opencl
src/kernels/level3/xgemm_part2.opencl
|
2016-08-20 12:50:31 +02:00 |
|
Cedric Nugteren
|
35623cd98d
|
Minor update regarding the previous CMake export/install target changes
|
2016-07-28 20:45:09 +02:00 |
|
Cedric Nugteren
|
57f09178d8
|
Added tuning results for AMD Oland and for Intel Graphics HD 530
|
2016-07-10 11:46:44 +02:00 |
|
Cedric Nugteren
|
27854070b4
|
Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen
|
2016-07-06 21:50:12 +02:00 |
|
Cedric Nugteren
|
9683b50c55
|
Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp)
|
2016-07-03 20:30:47 +02:00 |
|
Cedric Nugteren
|
9171f1c160
|
Updated the README in various places
|
2016-06-27 17:28:48 +02:00 |
|
Cedric Nugteren
|
5557a6ae81
|
Added vcvarsall to AppVeyor and added AppVeyor icons to README
|
2016-06-27 14:10:56 +02:00 |
|
Cedric Nugteren
|
7eeb790824
|
Added AppVeyor Windows CI support
|
2016-06-27 12:47:39 +02:00 |
|
Cedric Nugteren
|
5f8886339a
|
Increased coverage of Travis CI automatic builds
|
2016-06-27 12:16:12 +02:00 |
|
Cedric Nugteren
|
69beca90f4
|
Moved the performance graph scripts to the 'scripts' subfolder
|
2016-06-27 11:51:57 +02:00 |
|
Cedric Nugteren
|
66908ef5cd
|
Added tuning results for 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile' (thanks to OursDesCavernes)
|
2016-06-19 14:59:50 +02:00 |
|
Cedric Nugteren
|
61203453aa
|
Renamed all C++ source files to .cpp to match the .hpp extension better
|
2016-06-19 13:55:49 +02:00 |
|
Cedric Nugteren
|
52ccaf5b25
|
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing
|
2016-06-16 18:07:46 +02:00 |
|
Cedric Nugteren
|
6d6b030053
|
Made the CPU BLAS library the default reference to test against in favor of clBLAS
|
2016-06-08 09:21:39 +02:00 |
|
Cedric Nugteren
|
137d1d8708
|
Added tuning parameters for 'GRID K520' and 'HD Graphics Skylake ULT GT2'
|
2016-06-01 09:39:33 +02:00 |
|
Cedric Nugteren
|
305bf16c4c
|
Separated the performance tests (clients) from the correctness tests in CMake
|
2016-05-30 16:38:26 +02:00 |
|
Cedric Nugteren
|
9f87455070
|
Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM
|
2016-05-25 13:29:53 +02:00 |
|
Cedric Nugteren
|
ac1575056e
|
Added proper argument handling and displaying for half-precision data-types
|
2016-05-24 14:06:16 +02:00 |
|
Cedric Nugteren
|
ae7d705d6f
|
Updated README with information on half-precision support
|
2016-05-23 19:23:46 +02:00 |
|
Cedric Nugteren
|
489c5d76cf
|
Merged in latest changes from 0.7.1 release
|
2016-05-18 21:32:56 +02:00 |
|
Cedric Nugteren
|
1c72d225c5
|
Fixed links in the README
|
2016-05-10 21:03:51 +02:00 |
|