10 Commits (9a0b59d990be319952a4a02b9164b3b2327cd454)

Author SHA1 Message Date
Georgi Gerganov 2cdfc4e025
whisper : add support for large v3 (#1444)
* whisper : add support for large v3

* bench : fix build + fix go bindings

* bench : fix n_mels

* models : update readme
2023-11-07 15:30:18 +02:00
Akash Mahajan 3ec7bfffe0
py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001)
* patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer

* typo fix
2023-06-25 13:50:14 +03:00
AsukaMinato 94aa56f19e
minor : improve C++ and Python style (#768)
* use some STL functions

* use self.field than setattr, use pathlib.Path

* recover some format

* const some iter

* Keep the original

* 2 space
2023-04-29 10:06:25 +03:00
Ivan Gorin 62b51c3070
models : change convert-pt-to-ggml to use .tiktoken tokenizer files (#725)
2023-04-14 19:50:39 +03:00
Georgi Gerganov 00f46dbc1d
models : add usage comments to the HF convert script (#157)
2022-11-23 23:22:40 +02:00
Georgi Gerganov e70e5c8b53
models : simplify the conversion script
"transformers" dependency is not actually needed
2022-11-16 19:22:32 +02:00
Georgi Gerganov 46a68fb9b5
minor : remove one more redundant line
2022-11-11 18:02:58 +02:00
Georgi Gerganov ccd56a9c5b
minor : fix double float32 conversion in python script
2022-11-11 17:58:51 +02:00
Joonas Pihlajamaa 4e887dc350
Add enconding parameter to vocab.json opening to fix errors
2022-10-23 11:55:01 +03:00
Georgi Gerganov 6b45e37b2b
Update README.md and finalize the whisper.wasm example
2022-10-22 18:54:01 +03:00