Added documentation for the OverrideParameters function

Cedric Nugteren 2017-02-18 11:02:57 +01:00
parent d6538dfc25
commit fef11a208c
4 changed files with 34 additions and 3 deletions

@ -5,6 +5,7 @@ Development version (next release)
- Fixed a bug when using offsets in the direct version of the GEMM kernels
- Fixed a missing cl_khr_fp64 when running double-precision on Intel CPUs
- Tests now also exit with an error code when OpenCL errors or compilation errors occur
- Added the OverrideParameters function to the API to be able to supply custom tuning parameters
- Various minor fixes and enhancements
- Added tuned parameters for various devices (see README)

@ -156,7 +156,7 @@ Note that CLBlast's tuners are based on the [CLTune auto-tuning library](https:/
Compiling with `-DTUNERS=ON` will generate a number of tuners, each named `clblast_tuner_xxxxx`, in which `xxxxx` corresponds to a `.opencl` kernel file as found in `src/kernels`. These kernels correspond to routines (e.g. `xgemm`) or to common pre-processing or post-processing kernels (`copy` and `transpose`). Running such a tuner will test a number of parameter-value combinations on your device and report which one gave the best performance. Running `make alltuners` runs all tuners for all precisions in one go. You can set the default device and platform for `alltuners` by setting the `CLBLAST_DEVICE` and `CLBLAST_PLATFORM` environment variables.
The tuners output a JSON-file with the results. The best results need to be added to `src/database/kernels/xxxxx.hpp` in the appropriate section. However, this can be done automatically based on the JSON-data using a Python script in `scripts/database/database.py`. If you want the found parameters to be included in future releases of CLBlast, please attach the JSON files to the corresponding issue on GitHub or [email the main author](http://www.cedricnugteren.nl).
The tuners output a JSON-file with the results. The best results need to be added to `src/database/kernels/xxxxx.hpp` in the appropriate section. However, this can be done automatically based on the JSON-data using a Python (2.7 or 3.x) script in `scripts/database/database.py`. If you want the found parameters to be included in future releases of CLBlast, please attach the JSON files to the corresponding issue on GitHub or [email the main author](http://www.cedricnugteren.nl).
In summary, tuning the entire library for your device can be done as follows (starting from the root of the CLBlast folder):
@ -168,6 +168,8 @@ In summary, tuning the entire library for your device can be done as follows (st
python ../scripts/database/database.py . ..
make
Alternatively, you can supply your tuning parameters programmatically through the CLBlast API. This is especially useful if you tune for specific non-standard arguments (e.g. a rectangular or a very small matrix). To do so, call the `OverrideParameters` function, which sets new parameters for a specific kernel. On the next call of the target routine, CLBlast compiles a new binary and uses it together with the new parameters from then on, until `OverrideParameters` is called again. See the [API documentation](doc/clblast.md#overrideparameters-override-tuning-parameters-auxiliary-function) for more details.
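As a rough illustration of the programmatic route, the sketch below builds a parameter map for the `Copy` kernel and hands it to `OverrideParameters`. The values are the example ones from the API documentation, not tuned results for any particular device. `MakeCopyParameters` and `ApplyCopyOverride` are hypothetical helper names introduced here, and the actual call is only compiled when `clblast.h` can be found.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Only pull in CLBlast when the header is actually available.
#if defined(__has_include)
  #if __has_include(<clblast.h>)
    #include <clblast.h>
    #define HAVE_CLBLAST 1
  #endif
#endif

// Hypothetical helper: the four Copy-kernel parameters with the example
// values from the API documentation (placeholders, not tuned results).
std::unordered_map<std::string, std::size_t> MakeCopyParameters() {
  return {{"COPY_DIMX", 8}, {"COPY_DIMY", 32}, {"COPY_VW", 4}, {"COPY_WPT", 8}};
}

#ifdef HAVE_CLBLAST
// Hypothetical helper: overrides the Copy kernel parameters for single
// precision on the given device. The next routine call that uses the Copy
// kernel triggers a re-compilation with these values.
inline clblast::StatusCode ApplyCopyOverride(const cl_device_id device) {
  return clblast::OverrideParameters(device, "Copy", clblast::Precision::kSingle,
                                     MakeCopyParameters());
}
#endif
```

In a real application the values would come from the JSON output of the corresponding tuner rather than from a hard-coded map.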
Compiling the correctness tests (optional)
-------------

@ -2816,3 +2816,31 @@ CLBlastStatusCode CLBlastFillCache(const cl_device_id device)
Arguments to FillCache:
* `const cl_device_id device`: The OpenCL device to fill the cache for.
OverrideParameters: Override tuning parameters (auxiliary function)
-------------
This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called, its kernel is re-compiled with the new parameters. On all subsequent calls (until `OverrideParameters` is called again) the kernel is loaded from the cache and thus continues to use the new parameters. Note that the first call after `OverrideParameters` can be noticeably slower due to this re-compilation.
C++ API:
```
StatusCode OverrideParameters(const cl_device_id device, const std::string &kernel_name,
const Precision precision,
const std::unordered_map<std::string,size_t> &parameters)
```
C API:
```
CLBlastStatusCode CLBlastOverrideParameters(const cl_device_id device, const char* kernel_name,
const CLBlastPrecision precision, const size_t num_parameters,
const char** parameters_names, const size_t* parameters_values)
```
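The C signature takes the parameters as two parallel arrays instead of a map. A minimal sketch of producing that layout from C++ follows; `FlatParameters` and `Flatten` are hypothetical names used for illustration and are not part of CLBlast.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Parallel name/value arrays in the layout the C API signature expects.
// The pointers in `names` refer to the keys of the source map, so that map
// must outlive this struct and keep those entries while it is in use.
struct FlatParameters {
  std::vector<const char*> names;
  std::vector<std::size_t> values;
};

// Hypothetical helper: splits a parameter map into matching name and value
// arrays. Element i of `names` always belongs to element i of `values`.
FlatParameters Flatten(const std::unordered_map<std::string, std::size_t>& parameters) {
  FlatParameters flat;
  flat.names.reserve(parameters.size());
  flat.values.reserve(parameters.size());
  for (const auto& entry : parameters) {
    flat.names.push_back(entry.first.c_str());
    flat.values.push_back(entry.second);
  }
  return flat;
}

// Intended use (not compiled here; requires clblast_c.h and an OpenCL device):
//   CLBlastOverrideParameters(device, "Copy", CLBlastPrecisionSingle,
//                             flat.names.size(), flat.names.data(),
//                             flat.values.data());
```

The order of the entries is unspecified because the source is an unordered map; only the pairing between the two arrays is guaranteed by this helper.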
Arguments to OverrideParameters (C++ version):
* `const cl_device_id device`: The OpenCL device to set the new parameters for.
* `const std::string &kernel_name`: The target kernel name. This has to be one of the existing CLBlast kernels (Xaxpy, Xdot, Xgemv, XgemvFast, XgemvFastRot, Xger, Copy, Pad, Transpose, Padtranspose, Xgemm, or XgemmDirect). If this argument is incorrect, this function will return with the `clblast::kInvalidOverrideKernel` status-code.
* `const Precision precision`: The CLBlast precision enum to set the new parameters for.
* `const std::unordered_map<std::string,size_t> &parameters`: An unordered map of strings to integers. This has to contain all the tuning parameters for a specific kernel as reported by the included tuners (e.g. `{ {"COPY_DIMX",8}, {"COPY_DIMY",32}, {"COPY_VW",4}, {"COPY_WPT",8} }` for the `Copy` kernel). If this argument is incorrect, this function will return with the `clblast::kMissingOverrideParameter` status-code.
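Since the map has to contain every tuning parameter of the target kernel, it can help to check for gaps on the caller's side before invoking `OverrideParameters`. The sketch below is one way to do that; `MissingParameters` is a hypothetical helper, not part of CLBlast, and it does not claim to mirror how the library validates the map internally. The list of expected names has to be taken from the output of the matching tuner.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns the expected parameter names that are absent
// from `parameters`, sorted alphabetically. An empty result means the map
// covers every expected name.
std::vector<std::string> MissingParameters(
    const std::vector<std::string>& expected,
    const std::unordered_map<std::string, std::size_t>& parameters) {
  std::vector<std::string> missing;
  for (const auto& name : expected) {
    if (parameters.count(name) == 0) {
      missing.push_back(name);
    }
  }
  std::sort(missing.begin(), missing.end());
  return missing;
}
```

For the `Copy` kernel, for example, the expected names would be `COPY_DIMX`, `COPY_DIMY`, `COPY_VW`, and `COPY_WPT`; a map holding only two of them yields the other two as missing.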

@ -42,9 +42,9 @@ FILES = [
"/src/clblast_netlib_c.cpp",
]
HEADER_LINES = [121, 73, 125, 23, 29, 41, 65, 32]
FOOTER_LINES = [26, 139, 28, 38, 6, 6, 9, 2]
FOOTER_LINES = [25, 139, 27, 38, 6, 6, 9, 2]
HEADER_LINES_DOC = 0
FOOTER_LINES_DOC = 35
FOOTER_LINES_DOC = 63
# Different possibilities for requirements
ald_m = "The value of `a_ld` must be at least `m`."