Commit Graph

20 Commits (6e2ab6ee967c4a9b3350c7ce4e7d7b736c9e45f6)

Author SHA1 Message Date
Angus, Alexander 73f49e9b3d Updated according to feedback from CNugteren 2023-01-17 08:35:29 -08:00
Angus, Alexander 4f394608a2 implemented changes to boost Adreno performance according to https://jira-dc.qualcomm.com/jira/browse/OSR-8731 2023-01-03 10:56:04 -08:00
Koichi Akabe 301dc280df Fix xconvgemm kernel and enable ConvGemmMethod::kSingleKernel 2018-12-18 13:56:00 +09:00
Koichi Akabe a646d6ca46 Remove unnecessary attribute of inline function 2018-11-19 13:03:50 +09:00
Koichi Akabe 032e3b0cc0 Add kernel_mode option to im2col, col2im, and convgemm functions 2018-11-12 10:12:07 +09:00
Cedric Nugteren 6f67525ea6 Changed col2im to append to the existing im-buffer 2018-11-07 19:45:07 +01:00
Cedric Nugteren 2d32a23293 Added new col2im routine to the documentation 2018-11-01 21:46:19 +01:00
Koichi Akabe 0b3d04f709 Fix col2im implementation 2018-10-30 14:54:55 +09:00
Cedric Nugteren d45911b61d Added groundwork for col2im algorithm plus first non-working version of kernel and test 2018-10-23 20:52:25 +02:00
Cedric Nugteren c788e040f7 Added xCONVGEMM as im2col plus a batched GEMM kernel 2018-09-07 22:02:44 +02:00
Cedric Nugteren 838422fbb1 Further implemented single-kernel approach of convgemm; extended test to capture other parts of the kernel code 2018-05-21 11:47:16 +02:00
Cedric Nugteren 5d87abf780 Added method selection option to switch between im2col and single-kernel approach for convgemm 2018-05-21 11:28:11 +02:00
Cedric Nugteren 37cabd4f1f Moved new convgemm kernel to levelx kernel folder 2018-05-19 21:05:45 +02:00
Cedric Nugteren 27b52ac2c8 Second version of direct reading from image tensor for convgemm: also with local memory support now 2018-05-19 21:02:44 +02:00
Cedric Nugteren 9f02fb542c Completed kernel modifications for pre-processor of all other kernels 2017-12-09 20:44:21 +01:00
Cedric Nugteren 8905da259d Fixed a modulo and division issue manifesting on Apple OpenCL for im2col 2017-09-05 18:49:23 +02:00
Cedric Nugteren 297159d5b9 Fixed a bug in im2col: process only valid channel IDs 2017-08-31 21:58:12 +02:00
Cedric Nugteren 6194d43efb Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d 2017-08-31 20:34:10 +02:00
Cedric Nugteren 4d9d03ba51 Completed im2col implementation 2017-08-24 21:11:12 +02:00
Cedric Nugteren 803ca781f9 First version of im2col kernel, unoptimized but working 2017-08-19 18:25:13 +02:00