Actualize the README and remove the old ROADMAP (#471)

Cedric Nugteren 2023-05-18 18:10:12 +02:00 committed by GitHub
parent db3bd0a32e
commit ce5e446fbe
2 changed files with 6 additions and 45 deletions

@@ -5,8 +5,8 @@ CLBlast: The tuned OpenCL BLAS library
| Platform | Build status |
|-----|-----|
| Windows | [![Build Status](https://ci.appveyor.com/api/projects/status/github/cnugteren/clblast?branch=master&svg=true)](https://ci.appveyor.com/project/CNugteren/clblast) |
| Linux/macOS | ![Build Status](https://github.com/cnugteren/clblast/actions/workflows/build_and_test.yml/badge.svg?branch=master)
|
| Linux/macOS | [![Build Status](https://github.com/cnugteren/clblast/actions/workflows/build_and_test.yml/badge.svg?branch=master)](https://github.com/CNugteren/CLBlast/actions/workflows/build_and_test.yml) |
| Test machine (thanks to [ArrayFire](https://ci.arrayfire.org:8010/#/builders)) | Test status |
|-----|-----|
@@ -18,9 +18,9 @@ CLBlast: The tuned OpenCL BLAS library
| clblast-windows-amd-r9 | [![Test Status](http://ci.arrayfire.org:8010/badges/clblast-windows-amd-r9.svg)](http://ci.arrayfire.org:8010/#/builders/clblast-windows-amd-r9) |
| clblast-windows-nvidia-m6000 | [![Test Status](http://ci.arrayfire.org:8010/badges/clblast-windows-nvidia-m6000.svg)](http://ci.arrayfire.org:8010/#/builders/clblast-windows-nvidia-m6000) |
CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices. See [the CLBlast website](https://cnugteren.github.io/clblast) for performance reports on various devices as well as the latest CLBlast news.
CLBlast is a lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices. See [the CLBlast website](https://cnugteren.github.io/clblast) for performance reports on some devices.
The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library. See also the [CLBlast feature roadmap](ROADMAP.md) to get an indication of the future of CLBlast.
The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library.
Why CLBlast and not clBLAS or cuBLAS?
@@ -121,21 +121,7 @@ Contributing
Contributions are welcome in the form of tuning results for OpenCL devices previously untested or pull requests. See [the contributing guidelines](CONTRIBUTING.md) for more details.
The main contributing authors (code, pull requests, testing) are:
* [Cedric Nugteren](http://cnugteren.github.io) - main author
* [Anton Lokhmotov](https://github.com/psyhtest)
* [Dragan Djuric](https://github.com/blueberry)
* [Marco Hutter](http://marco-hutter.de/)
* [Hugh Perkins](https://github.com/hughperkins)
* [Gian-Carlo Pascutto](https://github.com/gcp)
* [Ivan Shapovalov](https://github.com/intelfx)
* [Dimitri Van Assche](https://github.com/dvasschemacq)
* [Shehzan Mohammed](https://shehzan10.github.io)
* [Marco Cianfriglia](https://github.com/mcian)
* [Kodonnell](https://github.com/kodonnell)
* [Koichi Akabe](https://github.com/vbkaisetsu)
* Everyone else listed as a [GitHub contributor](https://github.com/CNugteren/CLBlast/graphs/contributors)
The main contributing authors (code, pull requests, testing) can be found in the list of [GitHub contributors](https://github.com/CNugteren/CLBlast/graphs/contributors).
Tuning and testing on a variety of OpenCL devices was made possible by:
@@ -172,4 +158,4 @@ How to cite this work:
Support us
-------------
This project started in March 2015 as an evenings and weekends free-time project next to a full-time job for Cedric Nugteren. If you are in the position to support the project by OpenCL-hardware donations or otherwise, please find contact information on the [website of the main author](http://cnugteren.github.io).
This project started in March 2015 as an evenings and weekends free-time project next to a full-time job for Cedric Nugteren. You can find contact information on the [website of the main author](http://cnugteren.github.io).

@@ -1,25 +0,0 @@
CLBlast feature road-map
================
This file gives an overview of the main features planned for addition to CLBlast. A first-order indication time-frame for development time is provided:
| Issue# | When | Who | Status | What |
| ---------------------------------------------------------------|-------------|-----------|--------|---------------|
| - | Oct '17 | CNugteren | ✔ | CUDA API for CLBlast |
| [#169](https://github.com/CNugteren/CLBlast/issues/169) & #195 | Oct-Nov '17 | CNugteren | ✔ | Auto-tuning the kernel selection parameter |
| [#181](https://github.com/CNugteren/CLBlast/issues/181) & #201 | Nov '17 | CNugteren | ✔ | Compilation for Android and testing on a device |
| - | Nov '17 | CNugteren | ✔ | Integration of CLTune for easy testing on Android / fewer dependencies |
| [#128](https://github.com/CNugteren/CLBlast/issues/128) & #205 | Nov-Dec '17 | CNugteren | ✔ | Pre-processor for loop unrolling and array-to-register-promotion for e.g. ARM Mali |
| [#207](https://github.com/CNugteren/CLBlast/issues/207) | Dec '17 | CNugteren | ✔ | Tuning of the TRSM/TRSV routines |
| [#195](https://github.com/CNugteren/CLBlast/issues/195) | Jan '18 | CNugteren | ✔ | Extra GEMM API with pre-allocated temporary buffer |
| [#95](https://github.com/CNugteren/CLBlast/issues/95) & #237 | Jan '18 | CNugteren | ✔ | Implement strided batch GEMM |
| [#224](https://github.com/CNugteren/CLBlast/issues/224) | Jan-Feb '18 | CNugteren | ✔ | Implement Hadamard product (element-wise vector-vector product) |
| [#233](https://github.com/CNugteren/CLBlast/issues/233) | Feb '18 | CNugteren | ✔ | Add CLBlast to common package managers |
| [#223](https://github.com/CNugteren/CLBlast/issues/223) | Feb '18 | CNugteren | ✔ | Python OpenCL interface |
| [#237](https://github.com/CNugteren/CLBlast/issues/237) | Mar '18 | CNugteren | ✔ | Making tuning possible from the CLBlast API |
| [#228](https://github.com/CNugteren/CLBlast/issues/228) | Mar-Apr '18 | CNugteren | ✔ | Improving performance for Qualcomm Adreno GPUs |
| [#270](https://github.com/CNugteren/CLBlast/issues/270) | Oct '18 | CNugteren | ✔ | Implement col2im |
| - | ?? | CNugteren | | Add support for OpenCL image buffers |
| [#267](https://github.com/CNugteren/CLBlast/issues/267) | Jan '19 | vbkaisetsu| ✔ | Merge im2col and GEMM into a direct kernel |
| [#136](https://github.com/CNugteren/CLBlast/issues/136) | ?? | CNugteren | | Implement xAXPBY and xSET |
| [#169](https://github.com/CNugteren/CLBlast/issues/169) | ?? | dividiti | | Problem-specific tuning parameter selection |