Development version (next release)
- Added level-1 routines:
  * SSWAP/DSWAP/CSWAP/ZSWAP
  * SSCAL/DSCAL/CSCAL/ZSCAL
  * SCOPY/DCOPY/CCOPY/ZCOPY
  * SDOT/DDOT
  * CDOTU/ZDOTU
  * CDOTC/ZDOTC
- Added level-2 routines:
  * SGBMV/DGBMV/CGBMV/ZGBMV

Version 0.4.0
- Now using the Claduc C++11 interface to OpenCL
- Added plain C API for increased compatibility (clblast_c.h)
- Re-organized tuner infrastructure and added JSON output
- Removed clBLAS sources, it should now be installed separately for testing
- Added Travis continuous integration
- Added level-2 routines:
  * CHEMV/ZHEMV
  * SSYMV/DSYMV

Version 0.3.0
- Re-organized test/client infrastructure to avoid code duplication
- Added an optional bypass for pre/post-processing kernels in level-3 routines
- Significantly improved performance of level-3 routines on AMD GPUs
- Added level-3 routines:
  * CHEMM/ZHEMM
  * SSYRK/DSYRK/CSYRK/ZSYRK
  * CHERK/ZHERK
  * SSYR2K/DSYR2K/CSYR2K/ZSYR2K
  * CHER2K/ZHER2K
  * STRMM/DTRMM/CTRMM/ZTRMM

Version 0.2.0
- Added support for complex conjugate transpose
- Several host-code performance improvements
- Improved testing infrastructure and coverage
- Added level-2 routines:
  * SGEMV/DGEMV/CGEMV/ZGEMV
- Added level-3 routines:
  * CGEMM/ZGEMM
  * CSYMM/ZSYMM

Version 0.1.0
- Initial preview version release to GitHub
- Supported level-1 routines:
  * SAXPY/DAXPY/CAXPY/ZAXPY
- Supported level-3 routines:
  * SGEMM/DGEMM
  * SSYMM/DSYMM