Cedric Nugteren
|
5a690f4e36
|
Prints the current pandas version and reports the minimum required version
|
2016-07-02 16:44:13 +02:00 |
|
Cedric Nugteren
|
7cf2f8c268
|
Fixed some memory leaks related to events not properly cleaned-up
|
2016-07-02 15:34:55 +02:00 |
|
Cedric Nugteren
|
b330ab0866
|
Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library
|
2016-06-30 10:49:17 +02:00 |
|
Cedric Nugteren
|
cd74aaac52
|
Updated to version 6.0 of the CLCudaAPI header
|
2016-06-29 19:42:49 +02:00 |
|
Cedric Nugteren
|
56483347e8
|
Prepared the changelog for the next release
|
2016-06-28 22:33:13 +02:00 |
|
Cedric Nugteren
|
7c13bacf12
|
Merge pull request #70 from CNugteren/development
Update to version 0.8.0
|
2016-06-28 22:32:25 +02:00 |
|
Cedric Nugteren
|
577f0ee117
|
Updated to version 0.8.0
|
2016-06-28 21:32:00 +02:00 |
|
Cedric Nugteren
|
33dddd3ff1
|
Changed the AppVeyor buildscript to use nmake instead of 'cmake --build' (2)
|
2016-06-28 20:56:49 +02:00 |
|
Cedric Nugteren
|
a003cc2f2c
|
Changed the AppVeyor buildscript to use nmake instead of 'cmake --build'
|
2016-06-28 20:48:23 +02:00 |
|
Cedric Nugteren
|
743da1b3fc
|
Fixes bug in AppVeyor with install directory (2)
|
2016-06-28 20:06:34 +02:00 |
|
Cedric Nugteren
|
88014e38bc
|
Fixes bug in AppVeyor with install directory
|
2016-06-28 18:23:32 +02:00 |
|
Cedric Nugteren
|
7c6bb6e21d
|
Added configuration for AppVeyor to keep the results of the builds as an 'artifact'
|
2016-06-28 17:58:34 +02:00 |
|
CNugteren
|
871b576c06
|
Made it possible to build the clients and tests on Windows using Visual Studio
|
2016-06-28 16:38:45 +02:00 |
|
CNugteren
|
2c031f3e1d
|
Made it possible to build the OMATCOPY test and client in case only clBLAS is present
|
2016-06-28 16:36:01 +02:00 |
|
Cedric Nugteren
|
9171f1c160
|
Updated the README in various places
|
2016-06-27 17:28:48 +02:00 |
|
Cedric Nugteren
|
76b20cfe0c
|
Fixes for the AppVeyor Windows build
|
2016-06-27 14:44:08 +02:00 |
|
Cedric Nugteren
|
5557a6ae81
|
Added vcvarsall to AppVeyor and added AppVeyor icons to README
|
2016-06-27 14:10:56 +02:00 |
|
Cedric Nugteren
|
dac99451d9
|
Fixed a bug in the AppVeyor script
|
2016-06-27 13:55:16 +02:00 |
|
Cedric Nugteren
|
7eeb790824
|
Added AppVeyor Windows CI support
|
2016-06-27 12:47:39 +02:00 |
|
Cedric Nugteren
|
5f8886339a
|
Increased coverage of Travis CI automatic builds
|
2016-06-27 12:16:12 +02:00 |
|
Cedric Nugteren
|
69beca90f4
|
Moved the performance graph scripts to the 'scripts' subfolder
|
2016-06-27 11:51:57 +02:00 |
|
Cedric Nugteren
|
ca386f9883
|
Added fp16 to the alltuners target
|
2016-06-27 11:46:33 +02:00 |
|
Cedric Nugteren
|
fdfbc9af13
|
Changed the symbol for error-code skipped tests to distinguish them from successful error-code checks in the correctness tests
|
2016-06-27 11:27:54 +02:00 |
|
Cedric Nugteren
|
8f7131bd90
|
Increased the verbosity of the '-verbose' option for the correctness tests, now printing when a library is called
|
2016-06-27 11:16:30 +02:00 |
|
Cedric Nugteren
|
66908ef5cd
|
Added tuning results for 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile' (thanks to OursDesCavernes)
|
2016-06-19 14:59:50 +02:00 |
|
Cedric Nugteren
|
eab8d3cda1
|
Minor fix to the database script
|
2016-06-19 14:55:17 +02:00 |
|
Cedric Nugteren
|
395a0ef34e
|
Merge pull request #69 from CNugteren/refactoring
Refactoring of the Routine class and file-renaming
|
2016-06-19 14:03:53 +02:00 |
|
Cedric Nugteren
|
61203453aa
|
Renamed all C++ source files to .cpp to match the .hpp extension better
|
2016-06-19 13:55:49 +02:00 |
|
Cedric Nugteren
|
f726fbdc9f
|
Moved all headers into the source tree, changed headers to .hpp extension
|
2016-06-18 20:20:13 +02:00 |
|
Cedric Nugteren
|
bacb5d2bb2
|
Clean-up of the routine class, moved RunKernel to the routine/common file
|
2016-06-18 18:16:14 +02:00 |
|
Cedric Nugteren
|
7b4c0e1cf0
|
Removed the template from the Routine base-class
|
2016-06-18 14:56:55 +02:00 |
|
Cedric Nugteren
|
f9947b4d7f
|
Removed the precision argument from the routines in favor of a single templated function
|
2016-06-17 14:30:37 +02:00 |
|
Cedric Nugteren
|
536b7fe4bc
|
Removed the interface to the cache functions from the Routine class, calls them directly now
|
2016-06-17 13:57:50 +02:00 |
|
Cedric Nugteren
|
98a95c89fc
|
Moved the RunKernel and PadCopyTransposeMatrix functions out of the Routine class
|
2016-06-17 12:32:06 +02:00 |
|
Cedric Nugteren
|
520e28e7a7
|
Moved the ErrorIn function from the Routine class to the utilities header
|
2016-06-17 11:41:10 +02:00 |
|
Cedric Nugteren
|
afe8852eaa
|
Moved the test-for-valid-buffers function from the Routine class to separate functions in a separate file
|
2016-06-17 11:29:07 +02:00 |
|
Cedric Nugteren
|
52ccaf5b25
|
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing
|
2016-06-16 18:07:46 +02:00 |
|
Cedric Nugteren
|
39b7dbc5e3
|
Added some constness to variables related to the GEMM routines
|
2016-06-15 12:34:05 +02:00 |
|
Cedric Nugteren
|
b894611ad1
|
Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately
|
2016-06-14 18:17:58 +02:00 |
|
Cedric Nugteren
|
3e78a99355
|
Moved device vendor and type checks to a common header
|
2016-06-14 14:30:22 +02:00 |
|
Cedric Nugteren
|
6e2017c67d
|
Added support for FP16 on ARM Mali-T628 (officially not supported)
|
2016-06-14 14:29:53 +02:00 |
|
Cedric Nugteren
|
995a528cec
|
Improved API documentation and added documentation for level-2 and level-3 routines
|
2016-06-13 20:17:26 +02:00 |
|
Cedric Nugteren
|
4fb8f9517c
|
Added documentation for the matrix-update level-2 family of routines
|
2016-06-10 11:16:06 +02:00 |
|
Cedric Nugteren
|
6925003e45
|
Added global memory synchronisation for better cache performance on ARM Mali GPUs
|
2016-06-08 10:13:37 +02:00 |
|
Cedric Nugteren
|
6d6b030053
|
Made the CPU BLAS library the default reference to test against instead of clBLAS
|
2016-06-08 09:21:39 +02:00 |
|
Cedric Nugteren
|
7a7873d552
|
Fixed the RPATH settings for linking on OS X
|
2016-06-06 13:40:52 +02:00 |
|
Cedric Nugteren
|
c1895ea459
|
Made the tests for invalid buffer sizes also verbose in verbose mode
|
2016-06-06 12:20:42 +02:00 |
|
Cedric Nugteren
|
e561e3fbd5
|
Added a return value to the test binaries (0: success, 1: failure), allowing them to work properly under CTest
|
2016-06-02 16:24:22 +02:00 |
|
Cedric Nugteren
|
137d1d8708
|
Added tuning parameters for 'GRID K520' and 'HD Graphics Skylake ULT GT2'
|
2016-06-01 09:39:33 +02:00 |
|
Cedric Nugteren
|
983df6a8b4
|
Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test'
|
2016-05-31 20:53:55 +02:00 |
|