From ce5e446fbe9221b4888a199403a40d4293df97b8 Mon Sep 17 00:00:00 2001
From: Cedric Nugteren
Date: Thu, 18 May 2023 18:10:12 +0200
Subject: [PATCH] Actualize the README and remove the old ROADMAP (#471)

---
 README.md  | 26 ++++++--------------------
 ROADMAP.md | 25 -------------------------
 2 files changed, 6 insertions(+), 45 deletions(-)
 delete mode 100644 ROADMAP.md

diff --git a/README.md b/README.md
index bad6d9f7..a1ff2791 100644
--- a/README.md
+++ b/README.md
@@ -5,8 +5,8 @@ CLBlast: The tuned OpenCL BLAS library
 | Platform | Build status |
 |-----|-----|
 | Windows | [![Build Status](https://ci.appveyor.com/api/projects/status/github/cnugteren/clblast?branch=master&svg=true)](https://ci.appveyor.com/project/CNugteren/clblast) |
-| Linux/macOS | ![Build Status](https://github.com/cnugteren/clblast/actions/workflows/build_and_test.yml/badge.svg?branch=master)
- |
+| Linux/macOS | [![Build Status](https://github.com/cnugteren/clblast/actions/workflows/build_and_test.yml/badge.svg?branch=master)](https://github.com/CNugteren/CLBlast/actions/workflows/build_and_test.yml) |
+

 | Test machine (thanks to [ArrayFire](https://ci.arrayfire.org:8010/#/builders)) | Test status |
 |-----|-----|
@@ -18,9 +18,9 @@ CLBlast: The tuned OpenCL BLAS library
 | clblast-windows-amd-r9 | [![Test Status](http://ci.arrayfire.org:8010/badges/clblast-windows-amd-r9.svg)](http://ci.arrayfire.org:8010/#/builders/clblast-windows-amd-r9) |
 | clblast-windows-nvidia-m6000 | [![Test Status](http://ci.arrayfire.org:8010/badges/clblast-windows-nvidia-m6000.svg)](http://ci.arrayfire.org:8010/#/builders/clblast-windows-nvidia-m6000) |

-CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices. See [the CLBlast website](https://cnugteren.github.io/clblast) for performance reports on various devices as well as the latest CLBlast news.
+CLBlast is a lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices. See [the CLBlast website](https://cnugteren.github.io/clblast) for performance reports on some devices.

-The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library. See also the [CLBlast feature roadmap](ROADMAP.md) to get an indication of the future of CLBlast.
+The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library.


 Why CLBlast and not clBLAS or cuBLAS?
@@ -121,21 +121,7 @@ Contributing

 Contributions are welcome in the form of tuning results for OpenCL devices previously untested or pull requests. See [the contributing guidelines](CONTRIBUTING.md) for more details.

-The main contributing authors (code, pull requests, testing) are:
-
-* [Cedric Nugteren](http://cnugteren.github.io) - main author
-* [Anton Lokhmotov](https://github.com/psyhtest)
-* [Dragan Djuric](https://github.com/blueberry)
-* [Marco Hutter](http://marco-hutter.de/)
-* [Hugh Perkins](https://github.com/hughperkins)
-* [Gian-Carlo Pascutto](https://github.com/gcp)
-* [Ivan Shapovalov](https://github.com/intelfx)
-* [Dimitri Van Assche](https://github.com/dvasschemacq)
-* [Shehzan Mohammed](https://shehzan10.github.io)
-* [Marco Cianfriglia](https://github.com/mcian)
-* [Kodonnell](https://github.com/kodonnell)
-* [Koichi Akabe](https://github.com/vbkaisetsu)
-* Everyone else listed as a [GitHub contributor](https://github.com/CNugteren/CLBlast/graphs/contributors)
+The main contributing authors (code, pull requests, testing) can be found in the list of[GitHub contributors](https://github.com/CNugteren/CLBlast/graphs/contributors).

 Tuning and testing on a variety of OpenCL devices was made possible by:

@@ -172,4 +158,4 @@ How to cite this work:
 Support us
 -------------

-This project started in March 2015 as an evenings and weekends free-time project next to a full-time job for Cedric Nugteren. If you are in the position to support the project by OpenCL-hardware donations or otherwise, please find contact information on the [website of the main author](http://cnugteren.github.io).
+This project started in March 2015 as an evenings and weekends free-time project next to a full-time job for Cedric Nugteren. You can find contact information on the [website of the main author](http://cnugteren.github.io).
diff --git a/ROADMAP.md b/ROADMAP.md
deleted file mode 100644
index c1db9850..00000000
--- a/ROADMAP.md
+++ /dev/null
@@ -1,25 +0,0 @@
-CLBlast feature road-map
-================
-
-This file gives an overview of the main features planned for addition to CLBlast. A first-order indication time-frame for development time is provided:
-
-| Issue# | When | Who | Status | What |
-| ---------------------------------------------------------------|-------------|-----------|--------|---------------|
-| - | Oct '17 | CNugteren | ✔ | CUDA API for CLBlast |
-| [#169](https://github.com/CNugteren/CLBlast/issues/169) & #195 | Oct-Nov '17 | CNugteren | ✔ | Auto-tuning the kernel selection parameter |
-| [#181](https://github.com/CNugteren/CLBlast/issues/181) & #201 | Nov '17 | CNugteren | ✔ | Compilation for Android and testing on a device |
-| - | Nov '17 | CNugteren | ✔ | Integration of CLTune for easy testing on Android / fewer dependencies |
-| [#128](https://github.com/CNugteren/CLBlast/issues/128) & #205 | Nov-Dec '17 | CNugteren | ✔ | Pre-processor for loop unrolling and array-to-register-promotion for e.g. ARM Mali |
-| [#207](https://github.com/CNugteren/CLBlast/issues/207) | Dec '17 | CNugteren | ✔ | Tuning of the TRSM/TRSV routines |
-| [#195](https://github.com/CNugteren/CLBlast/issues/195) | Jan '18 | CNugteren | ✔ | Extra GEMM API with pre-allocated temporary buffer |
-| [#95](https://github.com/CNugteren/CLBlast/issues/95) & #237 | Jan '18 | CNugteren | ✔ | Implement strided batch GEMM |
-| [#224](https://github.com/CNugteren/CLBlast/issues/224) | Jan-Feb '18 | CNugteren | ✔ | Implement Hadamard product (element-wise vector-vector product) |
-| [#233](https://github.com/CNugteren/CLBlast/issues/233) | Feb '18 | CNugteren | ✔ | Add CLBlast to common package managers |
-| [#223](https://github.com/CNugteren/CLBlast/issues/223) | Feb '18 | CNugteren | ✔ | Python OpenCL interface |
-| [#237](https://github.com/CNugteren/CLBlast/issues/237) | Mar '18 | CNugteren | ✔ | Making tuning possible from the CLBlast API |
-| [#228](https://github.com/CNugteren/CLBlast/issues/228) | Mar-Apr '18 | CNugteren | ✔ | Improving performance for Qualcomm Adreno GPUs |
-| [#270](https://github.com/CNugteren/CLBlast/issues/270) | Oct '18 | CNugteren | ✔ | Implement col2im |
-| - | ?? | CNugteren | | Add support for OpenCL image buffers |
-| [#267](https://github.com/CNugteren/CLBlast/issues/267) | Jan '19 | vbkaisetsu| ✔ | Merge im2col and GEMM into a direct kernel |
-| [#136](https://github.com/CNugteren/CLBlast/issues/136) | ?? | CNugteren | | Implement xAXPBY and xSET |
-| [#169](https://github.com/CNugteren/CLBlast/issues/169) | ?? | dividiti | | Problem-specific tuning parameter selection |