Merge pull request #324 from CNugteren/CLBlast-315-tuning-api-improvements

Made tuning API more flexible
CLBlast-267-convgemm
Cedric Nugteren 2018-10-14 17:26:13 +02:00 committed by GitHub
commit ff7bee93d3
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 3 additions and 2 deletions

@@ -3,6 +3,7 @@ Development (next version)
 - Added support for shuffle instructions for NVIDIA GPUs (thanks to 'tyler-utah')
 - Added an option to compile the Netlib API with static OpenCL device and context (-DNETLIB_PERSISTENT_OPENCL=ON)
 - The tuners now check beforehand on invalid local thread sizes and skip those completely
+- Made the tuning API (OverrideParameters) more flexible, disregarding superfluous parameters
 - Fixed an issue with conjugate transpose not being executed in certain cases for a.o. XOMATCOPY
 - Fixed an issue with AMD GPUs and the new GEMMK == 1 kernel
 - Fixed an issue with the preprocessor and the new GEMMK == 1 kernel

@@ -201,7 +201,7 @@ These two functions require/retrieve the parameters as given in [src/database/ke
 | --------------------|-----------------------|
 | Xaxpy | VW, WGS, WPT |
 | Xdot | WGS1, WGS2 |
-| Xgemv | WGS1, WPT1, UNROLL1 |
+| Xgemv | WGS1, WPT1 |
 | XgemvFast | VW2, WGS2, WPT2 |
 | XgemvFastRot | VW3, WGS3, WPT3 |
 | Xger | WGS1, WGS2, WPT |

@@ -161,7 +161,7 @@ StatusCode OverrideParameters(const RawDeviceID device, const std::string &kerne
   // Verifies the parameters size
   const auto current_parameter_names = current_database.GetParameterNames();
-  if (current_parameter_names.size() != parameters.size()) {
+  if (current_parameter_names.size() > parameters.size()) {
     return StatusCode::kMissingOverrideParameter;
   }
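
The effect of relaxing the size check from `!=` to `>` can be illustrated with a small self-contained sketch. This is not the CLBlast implementation: `CheckOverride` is a hypothetical helper, the `StatusCode` enum below is a cut-down stand-in with only the two values needed here, and the by-name lookup after the size check is an assumption, since the hunk above is truncated before that part of the function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for CLBlast's StatusCode, reduced to the two values used in this sketch.
enum class StatusCode { kSuccess, kMissingOverrideParameter };

// Hypothetical helper (not part of CLBlast): validates a user-supplied parameter
// map against the names a kernel expects.
StatusCode CheckOverride(const std::vector<std::string> &expected_names,
                         const std::unordered_map<std::string, std::size_t> &parameters) {
  // Size check as in the new version of the hunk: only too few parameters is
  // an error, a larger map is allowed through.
  if (expected_names.size() > parameters.size()) {
    return StatusCode::kMissingOverrideParameter;
  }
  // Assumption: every expected name must still be present. Any extra entries
  // in 'parameters' are never looked at and are therefore disregarded.
  for (const auto &name : expected_names) {
    if (parameters.find(name) == parameters.end()) {
      return StatusCode::kMissingOverrideParameter;
    }
  }
  return StatusCode::kSuccess;
}
```

Under these assumptions, with expected names `WGS1` and `WPT1`, a map that also carries `UNROLL1` is accepted, whereas the old `!=` comparison would have rejected it on size alone. A map containing only `WGS1` is still reported as `kMissingOverrideParameter`.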