58 Commits (9a0b59d990be319952a4a02b9164b3b2327cd454)

Author SHA1 Message Date
st-gr eb23f4ef16
openvino : fix convert-whisper-to-openvino.py (#1890)
Fix issue: Conversion from Whisper to OpenVino failed #1870

convert-whisper-to-openvino.py stopped working with OpenVINO version 2023.0.0-10926-b4452d56304-releases/2023/0 .

Error was: TypeError: load(): incompatible function arguments. The following argument types are supported:
    1. (self: openvino._pyopenvino.FrontEnd, path: object) -> ov::frontend::InputModel

Tested successfully with a large-v3 conversion.

Co-authored-by: Stefan Grundmann <grundmanns@sandiego.gov>
2024-02-22 15:11:35 +02:00
Georgi Gerganov 3d42463845
models : add update py requirements 2024-02-13 11:51:32 +02:00
Michael Rienstra 4bbb60efce
docs : make model options / model install methods clearer (#1806)
* Make models more "discoverable"

* Clean up code block language identifiers

* make 3 options clearer

* undo Prettier formatter change

* docs: `$` shell prompt, consistently

* docs: minor changes
2024-01-26 17:39:54 +02:00
Sơn Phan Trung d05b7ee90e
models : make all scripts POSIX-compliant (#1725)
* download-coreml-model: make it POSIX-compliant

* download-ggml-model: posix compliant (2nd)

* minor edit

* forgot to add newline

* generate-coreml-interface: far more straightforward

* generate-coreml-model: done with the posix thingy

* typo

* Update download-ggml-model.sh

* fix

* fix typo

* another fix

* Update download-coreml-model.sh

* Update download-ggml-model.sh

* Update download-coreml-model.sh
2024-01-12 14:11:04 +02:00
Yajing Tang ba5bcde874
coreml : fix ANE optimized encoder (#1716) 2024-01-04 16:28:30 +02:00
Dimo a5cc3dc8a2
download : fix large q5 model name (#1695)
fixed typo in large-v3-q5-0 model name to match HF link
2023-12-29 11:14:32 +02:00
Chaoqun d2ee117a0a
docker : Dockerize whisper.cpp (#1674)
* build: add dockerfile for ci

* ci: add action to build/push docker image

* fix: lowercase repository to fix ci

* ci: update cuBLAS flag

* build: install curl and ffmpeg in image

* docs: add docker section

* fix: improve args check when download model
2023-12-22 11:16:02 +00:00
Georgi Gerganov c7606b47df
models : add info about distilled models 2023-11-15 21:10:13 +02:00
Georgi Gerganov bfbaa4dce5
whisper : make large version explicit + fix data size units (#1493) 2023-11-15 19:42:25 +02:00
bobqianic 953419c69a
openvino : update convert-whisper-to-openvino.py to support v3 (#1459) 2023-11-09 12:42:39 +02:00
Xiao-Yong Jin 0de8582f65
coreml : use the correct `n_mel` value (#1458) 2023-11-08 20:01:41 +00:00
Georgi Gerganov 2cdfc4e025
whisper : add support for large v3 (#1444)
* whisper : add support for large v3

* bench : fix build + fix go bindings

* bench : fix n_mels

* models : update readme
2023-11-07 15:30:18 +02:00
bobqianic 8a2bee6717
models : use absolute paths for the converted model (#1356) 2023-11-03 10:44:27 +02:00
WhiteOlivierus 45c87b5481
models : Faster download for models on windows using BitTransfer (#1404) 2023-10-30 19:18:12 +00:00
Xiang (Kevin) Li 91c0b23384
models : add conversion scripts from HuggingFace models to CoreML (#1304) 2023-10-04 12:00:25 +03:00
Neil Chudleigh aed5d40607
models : add quantized models to download-ggml-model.sh (#1235)
* Add quantized models to download-ggml-model.sh

* Update names in download-ggml-model script to normalized
2023-09-07 12:16:58 +03:00
Ryan Metcalfe 62b81276e0
whisper : add OpenVINO support (#1037)
* openvino: use OpenVINO encoder inference

* openvino: add python script for OpenVINO model generation

* whisper: Fix 'unused' warnings when OpenVINO isn't enabled in build

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* whisper: Fix compilation error

* whisper: revert whisper_get_openvino_path_encoder & whisper_get_openvino_path_cache to non-const func signatures

* cmake: Add openvino-encoder as separate object target

* whisper : minor style fixes

* minor : indentation fixes

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04 15:56:11 +03:00
Akash Mahajan c8d0f5fe98
whisper : support speaker segmentation (local diarization) of mono audio via tinydiarize (#1058)
* add HuggingFace mirror to download ggml model

* support tdrz via simple hack overriding solm tokens

* fix incorrect translate/transcribe token_ids that are not static const

* add apollo 13 sample for tdrz demo

* render [SPEAKER TURN] consistently in all terminal output using vocab.id_to_token

* extend whisper_segment with speaker_turn_next field and save in json output

* fix failing go build

* slipped in some python syntax whoops

* whisper : finalize tinydiarize support (add flag + fixes)

* whisper : tdrz support for word-level timestamps (respect max_len)

* java : try to fix tests after adding tdrz_enable flag

* main : remove TODO leftover

* java : fix params order list after adding "tdrz_enable"

* whisper : fix solm and add nosp token

* main : print tinydiarize help

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04 09:45:00 +03:00
Simon Moisselin 6c68218e3c
models : add ggml_to_pt script (#1042)
* adding ggml_to_pt

* typo sys too many args

* fixing swap errors dimensions

---------

Co-authored-by: simonMoisselin <simon.moisselin@gmail.com>
2023-06-25 15:29:54 +03:00
Roddur Dasgupta f11f33f1c0
models : cd statements are quoted to allow spaces in path (#1041) 2023-06-25 15:27:28 +03:00
Georgi Gerganov 8ac23c9f77
models : handle paths with spaces in download script (close #1038) 2023-06-25 15:23:23 +03:00
Akash Mahajan 3ec7bfffe0
py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001)
* patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer

* typo fix
2023-06-25 13:50:14 +03:00
genevera (she/her) 9b926844e3
models : fix README.md (#964)
Fixes typo on line 76 of models/README.md
2023-05-27 10:40:28 +03:00
Ahmad Bilal 95b02d76b0
coreml : add support of large-v1 model (#926) 2023-05-15 18:36:06 +03:00
Clifford Heath 9931d66400
readme : add instructions on converting to GGML + "--no-config" to wget (#874) 2023-05-08 20:58:36 +03:00
AsukaMinato 94aa56f19e
minor : improve C++ and Python style (#768)
* use some STL functions

* use self.field than setattr, use pathlib.Path

* recover some format

* const some iter

* Keep the original

* 2 space
2023-04-29 10:06:25 +03:00
Georgi Gerganov 5e47e223bd
whisper : add Core ML support (#566)
* coreml : use Core ML encoder inference

* coreml : simplify whisper_encode + log messages

* whisper : resolve rebase conflicts

* coreml : add scripts for CoreML model generation

* bench-all : recognize COREML flag
2023-04-15 13:21:27 +03:00
Ivan Gorin 62b51c3070
models : change convert-pt-to-ggml to use .tiktoken tokenizer files (#725) 2023-04-14 19:50:39 +03:00
be-next 18e6fb0287
models : handle spaces and special characters in shell script paths (#677)
This commit modifies the `get_script_path` function to correctly handle
spaces and special characters in directory paths. The fix involves adding
double quotes around variables and commands where needed to ensure proper
parsing of paths with spaces and special characters.
2023-03-29 23:38:33 +03:00
Kamilake 992aa2cd1b
models : change default encoding to utf8 (#605) 2023-03-22 21:17:24 +02:00
Georgi Gerganov 1beff6f66d
models : change HF hosting from dataset to model 2023-03-22 20:44:56 +02:00
Georgi Gerganov d629c034a4
models : fix HF model URL (close #356) 2023-01-02 09:54:43 +02:00
Ikko Ashimine 3467230a77
models : fix typo in convert-h5-to-ggml.py
signficant -> significant
2022-12-31 09:49:01 +02:00
Georgi Gerganov 77226aa89d
models : fix support for spaces in path (close #315) 2022-12-23 11:11:38 +02:00
Georgi Gerganov a613f16aec
talk : improve prompting 2022-12-12 23:44:36 +02:00
Kartik Saranathan d91c001120
Fix paths echoed after the download
Was using models path instead of root path
2022-12-08 09:23:52 +02:00
Georgi Gerganov 9fe7306f4b
models : add the new "large" model release by OpenAI
The old "large" model is now renamed "large-v1".
If you have been using it, make sure to rename it and download the new
"large" model for best results.
2022-12-06 18:48:57 +02:00
Georgi Gerganov abce28ea99
talk.wasm : move to https://whisper.ggerganov.com/talk
This way, we can share the same models across different WASM examples
and not have to download them for each page
2022-11-24 18:24:06 +02:00
Georgi Gerganov a2ecd54455
models : add instructions for using HF fine-tuned models 2022-11-24 17:54:41 +02:00
Georgi Gerganov 00f46dbc1d
models : add usage comments to the HF convert script (#157) 2022-11-23 23:22:40 +02:00
Georgi Gerganov 5698bddbc9
models : fix HF fine-tuned model conversion script (#157)
It works now
2022-11-23 23:14:11 +02:00
Georgi Gerganov d64d6ca3fd
models : minor changes to the HF convert script (#157) 2022-11-23 22:07:20 +02:00
Georgi Gerganov 93482d0373
models : add "convert-h5-to-ggml.py" script (#157)
Converts transformers models to ggml.
Although the conversion is successful, it does not work for some reason.
Not sure why
2022-11-23 17:19:22 +02:00
Georgi Gerganov e70e5c8b53
models : simplify the conversion script
"transformers" dependency is not actually needed
2022-11-16 19:22:32 +02:00
Dody Suria Wijaya 55a0e1a64e
Update download-ggml-model.sh
follow curl redirect to new hosting site
2022-11-16 18:59:44 +02:00
Georgi Gerganov 864a78a8d0
models : change default hosting to Hugging Face
My Linode is running out of monthly bandwidth due to the big interest in
the project
2022-11-15 19:47:06 +02:00
Georgi Gerganov 46a68fb9b5
minor : remove one more redundant line 2022-11-11 18:02:58 +02:00
Georgi Gerganov ccd56a9c5b
minor : fix double float32 conversion in python script 2022-11-11 17:58:51 +02:00
Georgi Gerganov b5dde365e9
extra : compute SHA of all models files 2022-11-02 18:31:55 +02:00
Mikhail Grigorev b26345cc7b
Added for Windows implemented script download-ggml-model.cmd 2022-10-31 19:38:20 +02:00