Commit graph

190 commits

Author SHA1 Message Date
Cedric Nugteren 276e772a2c Added first auto-generated database headers from the Python database; only K40 and Iris supported now 2016-01-30 11:43:21 +01:00
Cedric Nugteren 76c9148030 Minor improvements to the database script, including proper file paths 2016-01-24 17:56:27 +01:00
Cedric Nugteren f0b3091cdb Added Python function to compute defaults for a particular device/vendor combination 2016-01-24 17:35:31 +01:00
CNugteren 09c94b17cf Added tuning data for Tesla K40 2015-10-28 21:20:42 +01:00
CNugteren c0d469718a Now sets local memory size in xgemv tuner properly 2015-10-28 21:19:59 +01:00
CNugteren bb4e78f737 Added initial tuning database with Intel Iris data 2015-10-25 16:49:59 +01:00
CNugteren ccd1a5c7cc Updated tuning database script according to the new JSON format 2015-10-25 16:49:29 +01:00
CNugteren 179ad0666d Fixed an arguments-related bug in the GEMV tuner 2015-10-25 16:48:26 +01:00
CNugteren a2d5d7770e Moved the tuner database script to a separate folder 2015-10-25 16:27:14 +01:00
CNugteren 9bf6be8426 Added alpha and beta to tuner meta-data 2015-10-23 11:01:44 +02:00
CNugteren 3f616366bd Prepared the changelog for the next release 2015-10-17 15:57:04 +02:00
CNugteren 92404035e8 Updated to version 0.5.0 2015-10-17 15:48:13 +02:00
CNugteren afb3e64fd3 Travis now also build the development branch 2015-10-17 15:42:45 +02:00
Cedric Nugteren 653feca564 Merge pull request #28 from CNugteren/kernels_reorganization: Kernels re-organization level-3 2015-10-17 15:30:06 +02:00
CNugteren 0d4091fdfb Added guards for routine-specific level-3 pad kernels 2015-10-13 08:29:45 +02:00
CNugteren f74c9a5640 Routine names are now all default arguments defined in the header 2015-10-12 08:35:58 +02:00
CNugteren 54a8723f8c Moved level3 kernel files to a subfolder 2015-10-12 08:28:40 +02:00
Cedric Nugteren 92b4b0d1fe Merge pull request #27 from CNugteren/level2_matrix_vector: Added many level-2 matrix-vector routines 2015-09-26 17:02:34 +02:00
CNugteren 2b56c2c603 Added TRMV/TBMV/TPMV routines 2015-09-26 16:58:03 +02:00
CNugteren 04d28b0420 Made buffer copying a const-method for the source 2015-09-26 16:48:11 +02:00
CNugteren de6547a92b Added SBMV and SPMV routines 2015-09-19 18:01:19 +02:00
CNugteren 80da67d28b Added the HPMV routine 2015-09-19 17:40:38 +02:00
CNugteren c32c4a9739 Added infrastructure for packed matrices 2015-09-19 17:37:42 +02:00
CNugteren aebd156869 Added the HBMV routine 2015-09-19 11:11:34 +02:00
CNugteren 93dddda63e Improved the organization and performance of level 2 routines 2015-09-18 17:46:41 +02:00
CNugteren 4507ba4997 Added first version of banded matrix-vector multiplication 2015-09-18 15:25:20 +02:00
Cedric Nugteren 42db8ea968 Merge pull request #26 from CNugteren/routine_definitions: Generated API interface and implementations 2015-09-18 10:23:16 +02:00
CNugteren 4796c9bcbd Added generated main functions for correctness/performance tests for level 2 routines 2015-09-18 10:19:03 +02:00
CNugteren 6105ad6f5b Added interface of all level 2 routines 2015-09-17 17:05:45 +02:00
CNugteren 6307d2e5db Added script to generate API interface and implementation automatically 2015-09-17 10:14:33 +02:00
CNugteren 1c24210026 Made Travis always build pushes to the master branch 2015-09-14 17:16:31 +02:00
Cedric Nugteren a2b773573d Merge pull request #25 from CNugteren/level1_routines: Added several level 1 routines 2015-09-14 17:12:23 +02:00
CNugteren 224c967584 Removed routines from the table which are not supported by clBLAS 2015-09-14 17:02:33 +02:00
CNugteren a2e726d3bd Added xDOT/xDOTU/xDOTC dot-product routines 2015-09-14 16:57:00 +02:00
CNugteren 2a383f3450 Added extra temporary buffer to tuners in preparation of Xdot routines 2015-09-14 15:53:34 +02:00
CNugteren e0c5312abb Added support for the dot buffer and offset argument 2015-09-14 12:28:50 +02:00
CNugteren b0b81deae1 Minor update of options-printing syntax 2015-08-24 07:38:20 +02:00
CNugteren ff0c54c386 Added the XSWAP, XSCAL and XCOPY level-1 routines 2015-08-22 17:11:20 +02:00
CNugteren 75517353d5 Re-organized level1 xaxpy kernel 2015-08-22 14:33:48 +02:00
CNugteren 70ba7c83d4 Prepared the changelog for the next release 2015-08-22 12:50:26 +02:00
CNugteren 74f601794d Updated to version 0.4.0 2015-08-22 12:41:40 +02:00
CNugteren ff1a670e88 Updated the documentation 2015-08-22 12:40:18 +02:00
CNugteren 5f5d31754a Added clblast prefix to binaries and added the alltests target 2015-08-21 07:36:19 +02:00
Cedric Nugteren cf168fca70 Merge pull request #23 from CNugteren/tuner_database: Added initial version of a tuner-database 2015-08-20 08:38:18 +02:00
CNugteren 15db2bcc20 Added initial version of tuner-database Python script 2015-08-20 08:30:51 +02:00
CNugteren 07e393cce4 Added target to run all tuners 2015-08-19 19:35:56 +02:00
CNugteren 798a3b6101 Add check for supported precision to the tuners 2015-08-19 19:35:08 +02:00
CNugteren b46de22433 Moved precision tester to utilities 2015-08-19 19:34:29 +02:00
CNugteren 8a02db0746 Added precision to the JSON output 2015-08-19 11:12:42 +02:00
CNugteren cbd25bffea Added hotfix 8eeb7f721f 2015-08-19 11:12:16 +02:00