Commit graph

1378 commits

Author SHA1 Message Date
Gard Spreemann 3d3492646c Correct capitalization typo
The CLBlastConfig.cmake file was installed to a directory named
CLBLast (notice second capital l), which can cause issues for CMake's
search path when looking for CLBlast on the system.

This commit also fixes other occurrences of the wrong capitalization,
all of it purely cosmetic (i.e. in comments).
2021-04-30 10:27:22 +02:00
Cedric Nugteren ef5176dd96
Merge pull request #416 from JishinMaster/master
set the correct flop count for xgemm
2021-03-15 20:15:02 +01:00
JishinMaster aec45ea637 set the correct flop count for xgemm 2021-03-13 21:48:04 +01:00
Cedric Nugteren ce44c3adb5
Merge pull request #414 from CNugteren/CLBlast-412-python-runtime-libs-fix
Fix Windows paths in pyclblast
2021-02-06 13:30:24 +01:00
Cedric Nugteren 1fa0930d85 Fix Windows paths in pyclblast 2021-02-05 21:52:23 +01:00
Cedric Nugteren fe93153404
Merge pull request #413 from CNugteren/CLBlast-412-python-runtime-libs
Add library dir on Linux for pyclblast
2021-02-04 20:45:40 +01:00
Cedric Nugteren d57f8065ea Added second Windows library path 2021-02-04 20:13:02 +01:00
Cedric Nugteren c78c649844 Add library path for Windows as well 2021-01-30 14:28:11 +01:00
Cedric Nugteren bbcb357a71 Add library dir on Linux for pyclblast 2021-01-29 20:48:05 +01:00
Cedric Nugteren 07837a5c2d Update pyclblast package version number 2021-01-21 20:49:31 +01:00
Cedric Nugteren a5ef06ec57
Merge pull request #410 from jamesjer/master
Use reference types to prevent unnecessary copying
2021-01-21 19:56:18 +01:00
Jerry James dc82a1fbc8 Use reference types to prevent unnecessary copying 2021-01-20 10:21:36 -07:00
Cedric Nugteren 70016e8698 Updated to version 1.5.2 2021-01-19 21:19:12 +01:00
Cedric Nugteren 0ee39af5ed Add tuning results for TITAN RTX 2020-10-10 13:01:12 +02:00
Cedric Nugteren 481d86665f Add tuning results for Radeon RX Vega 2020-10-10 12:56:28 +02:00
Cedric Nugteren e6e2519eaa
Merge pull request #400 from baryluk/patch-6
Allow single graph / subplot on plot
2020-10-05 21:19:34 +02:00
Witold Baryluk ea199c3469
Allow single graph / subplot on plot
`plt.subplots` tries to be special and returns an array or a non-array depending on the number of subplots.

It is not actually helpful, and IMHO bad design.

Make it always `ndarray`.

The `and not type(axes) is np.ndarray` check is just in case matplotlib decides to make its behavior more uniform. For now, work around it.

Also, no need for `ndarray.flat` really.

Confirmed to work with existing benchmarks (i.e. rows=2, cols=3), and with single graphs (rows=1, cols=1).
2020-10-05 12:11:17 +00:00
Cedric Nugteren 3462d7fa85
Merge pull request #399 from baryluk/patch-3
Fix a typo in benchmark when running fp 16 vs 32
2020-10-04 16:43:21 +02:00
Witold Baryluk eb967a0943
Fix a typo in benchmark when running fp 16 vs 32
The intention here was to limit the iteration range to common indexes only.

Fix that.
2020-10-04 10:22:00 +00:00
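The idea of iterating over common indexes only can be sketched as follows. This is a minimal illustration, not the benchmark script's actual code: the function name `common_index_pairs` and the two argument names are hypothetical, and the sketch assumes the two result lists may differ in length.

```python
def common_index_pairs(half_results, single_results):
    """Pair up entries that exist in both lists (hypothetical helper).

    Only indexes valid for both inputs are visited, so a longer list
    never causes an IndexError on the shorter one.
    """
    count = min(len(half_results), len(single_results))
    return [(half_results[i], single_results[i]) for i in range(count)]


if __name__ == "__main__":
    # Three fp16 results but only two fp32 results: the third is skipped.
    print(common_index_pairs([1.0, 2.0, 3.0], [10.0, 20.0]))
```

Python's built-in `zip` stops at the shorter input and would give the same pairing; the explicit `min` is used here only to make the index range visible.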
Cedric Nugteren 615e5f0ff2
Merge pull request #397 from baryluk/patch-1
Fix Python SyntaxWarning
2020-10-04 11:07:56 +02:00
Cedric Nugteren cdcfbbc8bc
Merge pull request #398 from baryluk/patch-2
Fix --load_from_disk argument help message
2020-10-04 11:07:09 +02:00
Witold Baryluk 2dfe7c5c23
Fix --load_from_disk argument help message 2020-10-04 08:17:16 +00:00
Witold Baryluk 45fd085395
Fix Python SyntaxWarning
There is no guarantee that all empty string objects are the same object or share an object with the `""` literal.
2020-10-04 08:12:50 +00:00
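The point above can be illustrated with a short sketch. It assumes the warning came from an identity comparison (`is`) against a string literal, which Python 3.8 and later flag with a SyntaxWarning; the helper name `is_blank` is hypothetical and does not come from the repository.

```python
def is_blank(value: str) -> bool:
    """Return True for the empty string (hypothetical helper).

    Equality (==) compares contents. Identity (is) compares object
    addresses, and whether two empty strings share one object is an
    implementation detail that should not be relied on.
    """
    return value == ""


if __name__ == "__main__":
    # An empty string built at run time rather than written as a literal.
    built = "".join([])
    print(is_blank(built))
```

Comparing with `==` gives the same answer regardless of how the interpreter chooses to store empty strings.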
Cedric Nugteren 46fb748a96
Merge pull request #396 from CNugteren/CLBlast-395-fix-benchmark-script
Fix a Python 3 bug in the benchmark script
2020-10-03 10:50:43 +02:00
Cedric Nugteren 0abd62a0e7 Fix a Python 3 bug in the benchmark script 2020-10-02 20:32:58 +02:00
Cedric Nugteren b4cd2b04e9
Added FUNDING.yml file 2020-08-16 10:33:47 +02:00
Cedric Nugteren 41f344d1a6
Merge pull request #392 from 9prady9/fix_Program_getIR
Fix Program::GetIR to handle programs with multiple devices
2020-06-07 19:52:49 +02:00
Pradeep Garigipati dff65e9217 Add a cautionary note in Program::GetIR and mention the fix in CHANGELOG 2020-06-07 21:13:33 +05:30
Pradeep Garigipati aec71699f8
Fix Program::GetIR to handle programs with multiple devices 2020-06-05 12:00:45 +05:30
Cedric Nugteren da0e657d39
Merge pull request #389 from CNugteren/CLBlast-385-version-defines
Added version number defines
2020-05-13 20:28:58 +02:00
Cedric Nugteren 396ac0278a Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering 2020-05-12 14:43:25 +02:00
Cedric Nugteren 0826bfe683
Merge pull request #388 from CNugteren/CLBlast-381-gemm-direct-tuner-failure
Fixed tuners global workgroup size
2020-05-11 22:39:48 +02:00
Cedric Nugteren c369cf1a16 Increase display width of the local/global sizes 2020-05-11 20:26:33 +02:00
Cedric Nugteren 4a6c7c37a3 Made sure that the global workgroup size is a multiple of the local size in the tuners 2020-05-10 20:28:23 +02:00
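One common way to satisfy the constraint named in the commit above is to round the global size up to the next multiple of the local size. The sketch below assumes that approach; the commit message does not say how the tuners actually adjust the size, and the function name `round_up_to_multiple` is hypothetical.

```python
def round_up_to_multiple(global_size: int, local_size: int) -> int:
    """Round global_size up to the nearest multiple of local_size.

    Hypothetical helper: the rounding direction is an assumption,
    not something stated in the commit.
    """
    if local_size <= 0:
        raise ValueError("local_size must be positive")
    # Integer ceiling division, then scale back up by the local size.
    return ((global_size + local_size - 1) // local_size) * local_size


if __name__ == "__main__":
    print(round_up_to_multiple(100, 32))
    print(round_up_to_multiple(128, 32))
```

When the global size is rounded up this way, a kernel typically needs a bounds check so that the extra work-items do nothing.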
Cedric Nugteren 69a4b4d4b0 Added logging of local/global workgroup sizes when running the tuners 2020-05-10 20:08:28 +02:00
Cedric Nugteren 9abc416785
Merge pull request #386 from CNugteren/CLBlast-384-pyclblast-missing-routines
PyCLBlast: add missing batched routines
2020-05-10 18:23:41 +02:00
Cedric Nugteren 0870e76fba Updated PyCLBlast version number 2020-05-10 14:55:03 +02:00
Cedric Nugteren 0b7ce8033c Added a sample to demonstrate a batched routine 2020-05-10 14:54:50 +02:00
Cedric Nugteren b94e81af10 Added pyclblast bindings for the 3 batched routines 2020-05-10 12:26:25 +02:00
Cedric Nugteren 5f4b3ffcf7
Merge pull request #383 from CNugteren/CLBlast-382-improve-tuner
Move queue creation out of the tuner loop
2020-05-04 20:26:42 +02:00
Cedric Nugteren bbb2031bf3 Move queue creation out of the tuner loop 2020-05-03 20:30:55 +02:00
Cedric Nugteren 78300ccbea
Merge pull request #378 from CNugteren/CLBlast-377-fix-amax-amin
Change amax/amin behaviour
2020-03-15 11:34:31 +01:00
Cedric Nugteren 5f97d64505 Update API documentation 2020-03-08 11:29:47 +01:00
Cedric Nugteren b46853660e Made it more likely (but no guarantees) for amax/amin to return the first index 2020-03-08 11:26:49 +01:00
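As a sequential reference for what "return the first index" means, the sketch below picks the earliest position of the largest absolute value. It is an illustration only: the name `first_amax` and the zero-based index are assumptions, and a parallel reduction on a device may break ties differently, which is presumably why the commit offers no guarantee.

```python
def first_amax(values):
    """Return the index of the first element with the largest absolute value.

    Hypothetical sequential reference, zero-based indexing assumed.
    """
    if not values:
        raise ValueError("values must not be empty")
    best_index = 0
    best_magnitude = abs(values[0])
    for index, value in enumerate(values):
        # Strict '>' keeps the earliest index when magnitudes tie.
        if abs(value) > best_magnitude:
            best_magnitude = abs(value)
            best_index = index
    return best_index


if __name__ == "__main__":
    # -3.0 and 3.0 tie in magnitude; the earlier one is reported.
    print(first_amax([1.0, -3.0, 3.0, 2.0]))
```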
Cedric Nugteren 7fab29304c Added sample to play around with XAMAX routine 2020-03-08 11:26:18 +01:00
Cedric Nugteren e3ce88154a Silenced a new OpenCL warning message 2020-03-08 10:14:59 +01:00
Cedric Nugteren 8433985051 Updated to version 1.5.1 2020-02-18 10:29:40 +01:00
Cedric Nugteren bf4e4198b7
Merge pull request #376 from CNugteren/fix_tuner_exception_catching
Catches all exceptions of the tuners
2020-02-18 10:23:43 +01:00
Cedric Nugteren 49eb490ee1 Catches all exceptions of the tuners 2020-02-17 22:07:51 +01:00
Cedric Nugteren 8a19667e75
Merge pull request #372 from trantila/master
Reduced number of TestMatrix calls for the batched xgemm routines.
2019-12-15 09:33:53 +01:00