Commit graph

147 commits

Author SHA1 Message Date
Cedric Nugteren cf168fca70 Merge pull request #23 from CNugteren/tuner_database (Added initial version of a tuner-database) 2015-08-20 08:38:18 +02:00
CNugteren 15db2bcc20 Added initial version of tuner-database Python script 2015-08-20 08:30:51 +02:00
CNugteren 07e393cce4 Added target to run all tuners 2015-08-19 19:35:56 +02:00
CNugteren 798a3b6101 Add check for supported precision to the tuners 2015-08-19 19:35:08 +02:00
CNugteren b46de22433 Moved precision tester to utilities 2015-08-19 19:34:29 +02:00
CNugteren 8a02db0746 Added precision to the JSON output 2015-08-19 11:12:42 +02:00
CNugteren cbd25bffea Added hotfix 8eeb7f721f 2015-08-19 11:12:16 +02:00
Cedric Nugteren 85bd783e0d Merge pull request #22 from CNugteren/travis (Added Travis continuous integration) 2015-08-19 09:34:01 +02:00
CNugteren e806bc1ff0 Added Travis build-status to the README 2015-08-19 09:29:54 +02:00
CNugteren e239c5f852 Now using apt-get directly in Travis 2015-08-19 09:20:10 +02:00
CNugteren 4f79d13d1d Updated fglrx package in Travis 2015-08-19 09:10:45 +02:00
CNugteren 8d8dcda5bf Added OpenCL and Clang to travis 2015-08-19 09:06:59 +02:00
CNugteren 154b611546 Added GCC 4.8 and updated CMake 2015-08-18 08:25:32 +02:00
CNugteren ad4aade5d5 Added initial .travis.yml file 2015-08-18 07:22:27 +02:00
Cedric Nugteren 4f6e42d052 Merge pull request #21 from CNugteren/c_api (Added a plain C API) 2015-08-13 18:02:03 +02:00
CNugteren 4242f90215 Added the plain C API 2015-08-13 18:00:09 +02:00
CNugteren 603e389545 Added all supported routines to the C API 2015-08-13 17:58:46 +02:00
CNugteren 8eeb7f721f Fixed a complex data-type bug in the transpose kernel 2015-08-13 14:33:42 +02:00
CNugteren a6c104ef20 Added SGEMM example using the C API 2015-08-13 13:47:15 +02:00
CNugteren 8617195ac5 Added initial version of C API with just one routine 2015-08-13 13:46:13 +02:00
CNugteren f85d44f602 Added argument m,n,k metadata to JSON files 2015-08-13 08:33:04 +02:00
CNugteren dbdb58c600 Refactored the tuners, added JSON output 2015-08-09 15:50:41 +02:00
Cedric Nugteren e4aa4519c2 Merge pull request #19 from CNugteren/basic_level2_routines (Level-2 routines: HEMV and SYMV) 2015-08-04 08:19:42 +02:00
CNugteren 75b4d92ac3 Added distinguished names for GEMV inherited HEMV/SYMV 2015-08-04 08:15:39 +02:00
CNugteren d1a7cf18ec Abstracted loading of matrix A for GEMV kernel 2015-08-03 07:37:14 +02:00
CNugteren fc7cd434e1 Added HEMV and SYMV 2015-07-31 17:44:17 +02:00
CNugteren c52c5f3d35 Added HEMV and SYMV 2015-07-31 17:41:10 +02:00
CNugteren 938ca2707f Added HEMV routine 2015-07-31 17:35:42 +02:00
CNugteren b89517a2e7 Added SYMV routine 2015-07-31 17:13:41 +02:00
Cedric Nugteren 674f69390d Merge pull request #18 from CNugteren/correctness_test_refactoring (Refactored the correctness tests) 2015-07-31 16:01:47 +02:00
CNugteren c5d5adbddd Refactored the correctness tests 2015-07-31 15:52:13 +02:00
Cedric Nugteren 6e1e7fdcaf Merge pull request #17 from CNugteren/clblas_external (Removed clBLAS sources) 2015-07-31 11:20:30 +02:00
CNugteren a27ce11c69 Updated documentation reflecting removal of clBLAS sources 2015-07-31 11:15:48 +02:00
CNugteren 68044254c7 Removed clBLAS source code, now requires separate installation 2015-07-31 11:06:07 +02:00
CNugteren e4c9f4cfe5 Moved the preferred options of clBLAS (no tests) to the CLBlast CMakeLists file 2015-07-27 07:34:19 +02:00
Cedric Nugteren 1acec9c951 Merge pull request #16 from CNugteren/claduc_header (Now using the new Claduc C++11 OpenCL header) 2015-07-27 07:21:20 +02:00
CNugteren f7199b831f Now using the new Claduc C++11 OpenCL header 2015-07-27 07:18:06 +02:00
CNugteren b10f4a633c Prepared the changelog for the next release 2015-07-24 20:50:00 +02:00
CNugteren efbdcd2d90 Updated to version 0.3.0 2015-07-24 08:25:32 +02:00
Cedric Nugteren 44760e7381 Merge pull request #14 from CNugteren/amd_performance (Improved performance for AMD GPUs) 2015-07-24 08:21:01 +02:00
CNugteren a76dc2f09c Updated the docs to reflect the performance improvements 2015-07-24 08:16:41 +02:00
CNugteren 547b7afffc Updated the performance results, added HD7950 2015-07-23 18:25:39 +02:00
CNugteren 0273b622d3 Made the graph script robust against diagnostic system messages 2015-07-22 21:30:02 +02:00
CNugteren dd8471ba92 Set the correct name for AMD OpenCL devices 2015-07-22 19:25:06 +02:00
CNugteren 3a6bdeb79a Updated GEMM tuning results for Tahiti 2015-07-22 07:31:39 +02:00
CNugteren 4dcecfe934 Added workgroup shuffle option to transpose kernel for AMD GPUs 2015-07-22 07:31:16 +02:00
CNugteren d93efa3169 Transpose kernel now uses vectorized local memory loads and stores 2015-07-21 08:22:18 +02:00
CNugteren a0f0f6c8ce Triangular GEMM kernels are only compiled when needed 2015-07-19 16:36:12 +02:00
CNugteren 48e2e96f1b Kernel caching is now based on a routine's name 2015-07-19 16:24:14 +02:00
CNugteren 4e499a67c1 The kernel source string is now a routine's member variable 2015-07-19 13:44:37 +02:00