whisper.cpp/extra
Latest commit: b0502836b8 by Georgi Gerganov
whisper : add full CUDA and Metal offloading (#1472)
* whisper : migrate to ggml-backend

* whisper : fix logit reading

* whisper : fix tensor allocation during load

* whisper : fix beam-search with CUDA

* whisper : free backends + fix compile warning

* whisper : print when CUDA is enabled

* whisper : fix CoreML

* make : clean-up

* talk : fix compile warning

* whisper : support ggml_conv with CUDA and Metal (#1473)

* ggml : add CUDA support for ggml_conv

* whisper : remove ggml_repeat for conv bias + single backend

* cuda : fix im2col kernel

* metal : add im2col support + mul mat-vec f16 x f16

* bench-all : add q4 models

* whisper : clean-up

* quantize-all : fix

* ggml : im2col opts

* whisper : avoid whisper_model_data wrapper

* whisper : add note that ggml_mul_mat_pad does not work with CUDA

* whisper : factor out graph compute in common function

* whisper : fixes

* whisper : fix UB with measure buffers

* whisper : try to fix the parallel whisper_state functionality (#1479)

* whisper : try to fix the parallel whisper_state functionality

* whisper : fix multi-state Metal

* whisper : free backend instances in whisper_state
Committed: 2023-11-12 15:31:08 +02:00
File              Last commit message                                          Last commit date
bench-all.sh      whisper : add full CUDA and Metal offloading (#1472)         2023-11-12 15:31:08 +02:00
bench-wts.sh      bench-wts.sh : rename script + add execute permission        2023-03-06 21:02:24 +02:00
bench.py          extra: Add benchmark script implemented in Python (#1298)    2023-09-25 23:45:15 +08:00
convert-all.sh    whisper : add support for large v3 (#1444)                   2023-11-07 15:30:18 +02:00
deploy-wasm.sh    Node.js package (#260)                                       2022-12-12 20:17:27 +02:00
quantize-all.sh   whisper : add full CUDA and Metal offloading (#1472)         2023-11-12 15:31:08 +02:00
sha-all.sh        extra : compute SHA of all models files                      2022-11-02 18:31:55 +02:00
sync-ggml.sh      cuda : fix HIPBLAS build                                     2023-11-05 19:41:15 +02:00