Cedric Nugteren
eefe0df435
Made functions with scalar-buffers as output properly return values
2016-11-20 21:36:57 +01:00
Cedric Nugteren
8ae8ab06a2
Renamed the include and source files of the Netlib CBLAS API
2016-10-25 20:33:10 +02:00
Cedric Nugteren
140121ef91
Removed the clblast namespace from the Netlib C API source file to ensure proper linking
2016-10-25 20:21:50 +02:00
Cedric Nugteren
729862e873
Fixed some issues preventing the Netlib CBLAS API from linking correctly
2016-10-25 19:56:42 +02:00
Cedric Nugteren
926aca53a0
Made the Netlib CBLAS API use the same enums with prefixes as the regular C API of CLBlast
2016-10-25 19:45:57 +02:00
Cedric Nugteren
59183b7d79
Sets the proper sizes for the buffers for the Netlib CBLAS API
2016-10-25 19:21:49 +02:00
Cedric Nugteren
f96fd372bc
Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes
2016-10-25 14:28:52 +02:00
Cedric Nugteren
3b65eace0a
Merge branch 'development' into netlib_blas_api
Conflicts:
scripts/generator/generator.py
scripts/generator/generator/routine.py
2016-10-25 09:34:24 +02:00
Cedric Nugteren
a670c4c4bf
All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects
2016-10-22 16:14:56 +02:00
Cedric Nugteren
4a5516aa78
Added extra error codes to reflect the more detailed error reporting of OpenCL functions
2016-10-22 15:46:29 +02:00
Ivan Shapovalov
56f300607b
Routine: get rid of ::SetUp()
Since we now use C++ exceptions inside the implementation (and exceptions
can be thrown from constructors), there is no need for a separate
Routine::SetUp() function.
For this, we also change how the kernel source string is constructed.
The kernel-specific source code is now passed to the Routine ctor via
an initializer_list of C strings to avoid unnecessary data copying
while also working around C1091 of MSVC 2013.
2016-10-22 08:45:27 +03:00
Ivan Shapovalov
b98af44fcf
treewide: use C++ exceptions properly
Since the codebase is designed around proper C++ idioms such as RAII, it
makes sense to only use C++ exceptions internally instead of mixing
exceptions and error codes. The exceptions are now caught at top level
to preserve compatibility with the existing error code-based API.
Note that we deliberately do not catch C++ runtime errors (such as
`std::bad_alloc`) nor logic errors (aka failed assertions) because no
actual handling can ever happen for such errors.
However, in the C interface we do catch _all_ exceptions (...) and
convert them into a wild-card error code.
2016-10-22 08:45:25 +03:00
Cedric Nugteren
8d5747aa54
Made non-standard types void-pointers in the Netlib BLAS interface
2016-10-05 08:23:54 +02:00
Cedric Nugteren
a17b714c3e
Added first version of Netlib BLAS API header
2016-10-05 00:09:39 +02:00
Cedric Nugteren
a2f8350703
Refactored the Python C++ generator script; now conforms to the PEP8 style guide
2016-09-04 21:26:30 +02:00
Cedric Nugteren
b330ab0866
Added __declspec(dllexport) to ClearCache and FillCache, and added __declspec(dllimport) when not building the library
2016-06-30 10:49:17 +02:00
Cedric Nugteren
61203453aa
Renamed all C++ source files to .cpp to match the .hpp extension better
2016-06-19 13:55:49 +02:00
Cedric Nugteren
f726fbdc9f
Moved all headers into the source tree, changed headers to .hpp extension
2016-06-18 20:20:13 +02:00
Cedric Nugteren
bacb5d2bb2
Clean-up of the routine class, moved RunKernel to the routine/common file
2016-06-18 18:16:14 +02:00
Cedric Nugteren
52ccaf5b25
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing
2016-06-16 18:07:46 +02:00
Cedric Nugteren
995a528cec
Improved API documentation and added documentation for level-2 and level-3 routines
2016-06-13 20:17:26 +02:00
Cedric Nugteren
4fb8f9517c
Added documentation for the matrix-update level-2 family of routines
2016-06-10 11:16:06 +02:00
Cedric Nugteren
e561e3fbd5
Added a return value to the test binaries (0: success, 1: failure), allowing them to work properly under CTest
2016-06-02 16:24:22 +02:00
Cedric Nugteren
03182f9d07
Added half-precision tests for the clBLAS reference through conversion to single-precision
2016-05-26 23:36:19 +02:00
Cedric Nugteren
b487d4dd44
Added half-precision tests for the CBLAS reference through conversion to single-precision
2016-05-26 13:15:27 +02:00
Cedric Nugteren
4612ff3552
Added possibility to run the performance client with half-precision
2016-05-25 14:37:26 +02:00
Cedric Nugteren
9f87455070
Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM
2016-05-25 13:29:53 +02:00
Cedric Nugteren
3e9a07f00a
Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2
2016-05-22 16:59:14 +02:00
Cedric Nugteren
95b828da12
Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSBMV/HSPMV/HTRMV/HTBMV/HTPMV
2016-05-22 15:38:26 +02:00
Cedric Nugteren
803aaf3070
Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASUM/HSUM/iHAMAX/iHMAX/iHMIN
2016-05-22 14:47:14 +02:00
Cedric Nugteren
f2ba75890c
Initial changes in preparation for half-precision fp16 support
2016-05-12 19:56:21 +02:00
cnugteren
3b81ee2c08
Fixed an issue where the xAMAX tester would incorrectly report failures when testing against CBLAS
2016-05-08 18:28:01 +02:00
cnugteren
eaf1de5745
Fixed an issue where the xNRM2 and xASUM testers would incorrectly report failures for complex inputs
2016-05-08 18:07:55 +02:00
Cedric Nugteren
ed2904a344
Added preliminary generated API documentation
2016-05-08 09:49:00 +02:00
Cedric Nugteren
aa97c836b1
Fixed an issue with linking against the ATLAS BLAS library
2016-05-04 19:16:09 +02:00
Cedric Nugteren
bee2f943ec
Changed the index buffer of IxAMAX routines to unsigned int for proper buffer-size checking
2016-05-01 14:03:37 +02:00
Cedric Nugteren
e113ff0852
Added non-absolute minimum counterpart IxMIN of the BLAS routine IxAMAX
2016-04-30 09:49:39 +02:00
Cedric Nugteren
877aad693f
Added FillCache: a function to pre-compile all kernels for a specific device
2016-04-29 23:33:12 +02:00
Cedric Nugteren
d7ddbdeb1f
Added non-absolute counterparts xSUM and IxMAX of the BLAS routines xASUM and IxAMAX
2016-04-27 18:07:30 +02:00
Cedric Nugteren
8075934ca7
Added prototypes for non-BLAS routines: xSUM and IxMAX (non-absolute counterparts of xASUM and IxAMAX)
2016-04-27 17:06:19 +02:00
Cedric Nugteren
82be8f211c
Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache
2016-04-27 16:02:13 +02:00
Cedric Nugteren
3555cd0436
All CLBlast enum constants now have the same raw values as in the CBLAS standard
2016-04-27 11:37:55 +02:00
cnugteren
16a048f1ac
Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines
2016-04-20 22:12:51 -06:00
cnugteren
894983fc3c
Added prototype for ixAMAX routines
2016-04-20 21:11:33 -06:00
cnugteren
8be99de82d
Added support for the SASUM/DASUM/ScASUM/DzASUM routines
2016-04-14 19:58:26 -06:00
cnugteren
e0497807e2
Added prototype for xASUM routines
2016-04-13 21:44:49 -06:00
cnugteren
1d3d38a261
Events are now properly implemented using an event waiting list and by asking the user to wait for event completion
2016-04-09 22:22:24 -06:00
cnugteren
1a82861a90
Added support for testing (performance and correctness) against a CPU BLAS library
2016-04-02 11:58:00 -07:00
cnugteren
5c83217cf2
Added a wrapper for CBLAS libraries for performance/correctness testing
2016-04-01 22:36:39 -07:00
cnugteren
8c3c6db7d0
Merge branch 'level1_routines' into development
2016-03-30 21:37:56 -07:00
Cedric Nugteren
c1df786764
Added prototypes for the xROTM and xROTMG routines
2016-03-30 16:13:37 -07:00
Cedric Nugteren
6ecc0d089c
Added prototypes for the xROT and xROTG functions
2016-03-30 16:13:32 -07:00
Cedric Nugteren
6e5f558746
Made event an optional argument in the CLBlast C++ API
2016-03-30 16:13:26 -07:00
Cedric Nugteren
aaa687ca98
Added preliminary support for the xNRM2 routines
2016-03-28 23:00:44 +02:00
Cedric Nugteren
1d5a702d9d
Added prototypes for ScNRM2/DzNRM2 routines
2016-03-25 10:30:38 +01:00
Cedric Nugteren
3876096c30
Added prototypes for SNRM2/DNRM2 routines
2016-03-25 10:00:40 +01:00
Cedric Nugteren
49822c8ead
Fixed the C API export to be able to properly build a DLL on Windows
2016-03-23 20:49:28 +01:00
Cedric Nugteren
d935695417
Added __declspec(dllexport) to create a DLL on Windows
2016-03-19 11:09:09 +01:00
Cedric Nugteren
306bf67660
Added preliminary support for xHPR2 and xSPR2 routines
2016-03-06 15:48:11 +01:00
Cedric Nugteren
60da54da5d
Added preliminary support for xHER2 and xSYR2 routines
2016-03-02 21:18:01 +01:00
Cedric Nugteren
e3545215a5
Added support for xHER, xHPR, xSYR, and xSPR routines
2016-02-28 14:16:48 +01:00
Cedric Nugteren
9f682aa66b
Set a proper default precision for the CLBlast clients
2016-02-20 14:41:53 +01:00
Cedric Nugteren
6dc44da07b
Added support for xGERU and xGERC routines
2016-02-20 14:15:41 +01:00
Cedric Nugteren
8854a73127
Added XGER routine, kernel, and tuner
2016-02-20 12:40:01 +01:00
CNugteren
2b56c2c603
Added TRMV/TBMV/TPMV routines
2015-09-26 16:58:03 +02:00
CNugteren
de6547a92b
Added SBMV and SPMV routines
2015-09-19 18:01:19 +02:00
CNugteren
80da67d28b
Added the HPMV routine
2015-09-19 17:40:38 +02:00
CNugteren
aebd156869
Added the HBMV routine
2015-09-19 11:11:34 +02:00
CNugteren
4507ba4997
Added first version of banded matrix-vector multiplication
2015-09-18 15:25:20 +02:00
CNugteren
4796c9bcbd
Added generated main functions for correctness/performance tests for level 2 routines
2015-09-18 10:19:03 +02:00
CNugteren
6105ad6f5b
Added interface of all level 2 routines
2015-09-17 17:05:45 +02:00
CNugteren
6307d2e5db
Added script to generate API interface and implementation automatically
2015-09-17 10:14:33 +02:00