mirror of
https://github.com/CNugteren/CLBlast.git
synced 2024-07-07 12:23:46 +02:00
Development version (next release)
-

Version 0.5.0
- Improved structure and performance of level-2 routines (xSYMV/xHEMV)
- Reduced compilation time of level-3 OpenCL kernels
- Added level-1 routines:
  * SSWAP/DSWAP/CSWAP/ZSWAP
  * SSCAL/DSCAL/CSCAL/ZSCAL
  * SCOPY/DCOPY/CCOPY/ZCOPY
  * SDOT/DDOT
  * CDOTU/ZDOTU
  * CDOTC/ZDOTC
- Added level-2 routines:
  * SGBMV/DGBMV/CGBMV/ZGBMV
  * CHBMV/ZHBMV
  * CHPMV/ZHPMV
  * SSBMV/DSBMV
  * SSPMV/DSPMV
  * STRMV/DTRMV/CTRMV/ZTRMV
  * STBMV/DTBMV/CTBMV/ZTBMV
  * STPMV/DTPMV/CTPMV/ZTPMV

Version 0.4.0
- Now using the Claduc C++11 interface to OpenCL
- Added plain C API for increased compatibility (clblast_c.h)
- Re-organized tuner infrastructure and added JSON output
- Removed clBLAS sources; clBLAS should now be installed separately for testing
- Added Travis continuous integration
- Added level-2 routines:
  * CHEMV/ZHEMV
  * SSYMV/DSYMV

Version 0.3.0
- Re-organized test/client infrastructure to avoid code duplication
- Added an optional bypass for pre/post-processing kernels in level-3 routines
- Significantly improved performance of level-3 routines on AMD GPUs
- Added level-3 routines:
  * CHEMM/ZHEMM
  * SSYRK/DSYRK/CSYRK/ZSYRK
  * CHERK/ZHERK
  * SSYR2K/DSYR2K/CSYR2K/ZSYR2K
  * CHER2K/ZHER2K
  * STRMM/DTRMM/CTRMM/ZTRMM

Version 0.2.0
- Added support for complex conjugate transpose
- Several host-code performance improvements
- Improved testing infrastructure and coverage
- Added level-2 routines:
  * SGEMV/DGEMV/CGEMV/ZGEMV
- Added level-3 routines:
  * CGEMM/ZGEMM
  * CSYMM/ZSYMM

Version 0.1.0
- Initial preview version release to GitHub
- Supported level-1 routines:
  * SAXPY/DAXPY/CAXPY/ZAXPY
- Supported level-3 routines:
  * SGEMM/DGEMM
  * SSYMM/DSYMM