142 Commits (9a0b59d990be319952a4a02b9164b3b2327cd454)

Author SHA1 Message Date
Jhen-Jie Hong a5e60c019d
readme : add react-native bindings (#619) 2023-03-22 21:39:02 +02:00
Georgi Gerganov 1beff6f66d
models : change HF hosting from dataset to model 2023-03-22 20:44:56 +02:00
Georgi Gerganov fa9d43181f
readme : add bench-wts.sh demo 2023-03-06 21:06:27 +02:00
Georgi Gerganov ad1389003d
release : v1.2.1 2023-02-28 22:29:12 +02:00
Aaron Pham d176160f6f
readme : add pybind11 bindings (#538) 2023-02-27 21:02:11 +02:00
Georgi Gerganov ca21f7ab16
readme : add cython bindings (#9) 2023-02-24 08:46:06 +02:00
Georgi Gerganov 2407ae8ef0
readme : add Ruby discussion + update .NET discussion 2023-02-15 19:51:54 +02:00
Georgi Gerganov 9764782bd9
readme : add another .NET repo (#303) 2023-02-14 20:04:03 +02:00
Georgi Gerganov 3b010f9bed
readme : add .NET repo (#303) 2023-02-11 17:35:33 +02:00
Georgi Gerganov b2083c5d02
release : v1.2.0 2023-02-04 09:49:49 +02:00
Georgi Gerganov f3ee4a9673
whisper : reduce memory usage during inference (#431)
* ggml : add "scratch" buffer support

* ggml : support for scratch ring-buffer

* ggml : bug fix in ggml_repeat()

* ggml : error on scratch buffer overflow

* whisper : use scratch buffers during inference (base model only)

* whisper : update memory usage for all models

* whisper : fix encoder memory usage

* whisper : use whisper_context functions instead of macros

* whisper : fix FF + remove it from README

* ggml : reuse ggml_new_i32

* ggml : refactor the scratch buffer storage

* whisper : reorder scratch buffers in the decoder

* main : add option to disable temp fallback

* Update README.md
2023-02-04 09:45:52 +02:00
Georgi Gerganov 2c3f50a021
release : v1.1.1 2023-01-23 20:23:44 +02:00
Georgi Gerganov 874bde887e
Update README.md 2023-01-16 18:47:31 +02:00
Georgi Gerganov 8738427dd6
cmake : bump version to 1.1.0 2023-01-15 14:33:13 +02:00
Georgi Gerganov 0b85e8c401
Update README.md 2023-01-15 11:36:20 +02:00
Georgi Gerganov 8de452c18b
Improve decoding (#291)
* whisper : prepare infra for new decoding strategies

* whisper : apply logit filters and compute logprobs

* whisper : add whisper_get_logits()

* whisper : separate self and cross attention memory

Initial step needed for supporting parallel decoders

* whisper : move probs_id buffer to whisper_context

* whisper : refactor kv cache into separate struct

* whisper : move self-attention kv cache to whisper_decoder

* whisper : wip decoding parameters + strategies

* whisper : wip decoding parameters + strategies (part 2)

* whisper : wip decoding parameters + strategies (part 3)

* whisper : wip decoding parameters + strategies (part 4)

* whisper : fix prompt_past update to not include prompt_init

* whisper : temperature + best_of support

* whisper : support for compression_ratio_threshold

We actually use entropy, but it is similar

* command : fix example to use logits instead of obsolete probs

* whisper : handle empty sequence ranking

* whisper : add WHISPER_DEBUG + diagnostic prints + new main args

* whisper : minor fixes

* whisper : add beam-search support

* whisper : bug fix when there is no previous context

* whisper : add comments

* stream : disable temperature fallback

For real-time processing, we always want a single decoder running at T=0

* whisper.swiftui : update example - fix paths + add empty folders
2023-01-15 11:29:57 +02:00
Ian Bicking 5e9f33596f
readme : clarify main and stream usage (#391)
Give an example of ./main that uses a sample file that's already there, and make the stream example clarify you need `make stream`
2023-01-08 20:18:41 +02:00
Thomas Fitzsimmons 1944e7c33e
whisper : document POWER VSX support 2023-01-05 23:53:00 +02:00
Georgi Gerganov 1480a5f1af
Update README.md
Add SwiftUI example links
2022-12-23 11:02:46 +02:00
Georgi Gerganov 4c1fe0c813
Update README.md
Add bindings links / discussions
2022-12-22 18:22:58 +02:00
Georgi Gerganov afe2db0fe2
Add Roadmap 2022-12-16 23:41:57 +02:00
Georgi Gerganov ea19ed33f1
Update README.md (#46)
Add references to the new Android app
2022-12-16 19:28:51 +02:00
Georgi Gerganov c37c2443c1
Update README.md (#56) 2022-12-16 18:01:05 +02:00
Georgi Gerganov 812ae3ffbd
Update README.md 2022-12-12 20:20:51 +02:00
Georgi Gerganov fcf515de60
bench.wasm : same as "bench" but runs in the browser (#89) 2022-12-11 11:09:10 +02:00
Georgi Gerganov 3b1aacbe6d
talk : talk with AI in the terminal 2022-12-10 16:51:58 +02:00
Georgi Gerganov 3996ecc156
Update README.md 2022-12-07 05:15:46 +02:00
Georgi Gerganov 9fe7306f4b
models : add the new "large" model release by OpenAI
The old "large" model is now renamed "large-v1".
If you have been using it, make sure to rename it and download the new
"large" model for best results.
2022-12-06 18:48:57 +02:00
Georgi Gerganov 6fd5358dd0
Update README.md 2022-11-27 11:30:32 +02:00
Georgi Gerganov 67e819baf4
minor : remove "examples/" prefix from the README 2022-11-26 13:07:54 +02:00
Georgi Gerganov a425365b82
yt-wsp.sh : script to easily transcribe VODs
Thanks to @DaniruKun
ref: https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818

Usage:

  cd whisper.cpp
  make

  ./examples/yt-wsp.sh <video-url>
2022-11-26 12:54:42 +02:00
Georgi Gerganov e0e864d9ca
Update README.md 2022-11-26 11:56:55 +02:00
Georgi Gerganov 68ecadbbc9
command.wasm : add voice assistant example for the Web (#171)
Same as the command-line tool "command", but runs in the browser

Also, added helper script "extra/deploy-wasm.sh" and fixed some timing
constants for the WASM examples.
2022-11-26 11:40:06 +02:00
Georgi Gerganov 1246dd023e
command : add demonstration video 2022-11-25 20:23:58 +02:00
Georgi Gerganov bc88eb13c6
examples : add "command" tool (#171) 2022-11-25 19:36:57 +02:00
Georgi Gerganov b8ce25dec1
refactoring : more readable code 2022-11-25 19:28:04 +02:00
Georgi Gerganov 2c0501b38a
Update README.md 2022-11-24 20:06:51 +02:00
Georgi Gerganov 35cd29ce1f
ggml : fix cross-compile Linux -> Windows with mingw (#168) 2022-11-23 22:28:41 +02:00
Georgi Gerganov a156a358ca
Revert "update README.md"
This reverts commit 6a84147113.
2022-11-23 22:16:50 +02:00
katsu560 6a84147113
update README.md 2022-11-23 22:16:33 +02:00
Georgi Gerganov 363a2dadec
Update README.md 2022-11-23 09:53:55 +02:00
Georgi Gerganov 623a486056
Update README.md 2022-11-23 09:52:36 +02:00
Georgi Gerganov 2e311a2917
Update README.md 2022-11-21 18:52:20 +02:00
Georgi Gerganov 864a78a8d0
models : change default hosting to Hugging Face
My Linode is running out of monthly bandwidth due to the big interest in
the project
2022-11-15 19:47:06 +02:00
Georgi Gerganov 8fdfb0ba92
Update README.md 2022-11-06 21:04:21 +02:00
Georgi Gerganov a09e9123ca
Update README.md 2022-11-05 08:44:41 +02:00
Georgi Gerganov 0e689f83d8
Update README.md 2022-11-02 22:03:27 +02:00
Georgi Gerganov d5afebd37c
whisper : token-level timestamp refactoring (#49, #120)
This turned out pretty well overall. The algorithm has been moved from
main.cpp to whisper.cpp and can be reused for all subtitle types. This
means that you can now specify the maximum length of the generated
lines. Simply provide the "-ml" argument specifying the max length in
number of characters.
2022-11-02 21:45:54 +02:00
Georgi Gerganov 4b1c32e8ea
Update README.md 2022-11-02 18:33:29 +02:00
Georgi Gerganov b5dde365e9
extra : compute SHA of all models files 2022-11-02 18:31:55 +02:00
Georgi Gerganov e46bc56e71
Update README.md 2022-11-01 22:47:58 +02:00
Georgi Gerganov b0f2aa0ea6
Update README.md 2022-10-30 17:10:46 +02:00
Georgi Gerganov 2c281d190b
Update README.md 2022-10-28 22:09:40 +03:00
Georgi Gerganov 9ccafa8792
Update README.md 2022-10-25 20:53:48 +03:00
Georgi Gerganov 89d8ee3ee5
Update README.md 2022-10-25 20:53:48 +03:00
Georgi Gerganov c6710efde2
refactoring : move main + stream in examples + other stuff 2022-10-25 20:53:48 +03:00
Georgi Gerganov 728676927f
Update README.md 2022-10-24 18:26:21 +03:00
Georgi Gerganov 181b762de8
Update README.md 2022-10-23 12:47:51 +03:00
Georgi Gerganov 4196856c7b
Update README.md 2022-10-23 10:24:36 +03:00
Georgi Gerganov 705198f063
Update README.md 2022-10-23 10:12:10 +03:00
Georgi Gerganov 3e69a6071d
Update README.md 2022-10-23 08:04:33 +03:00
Georgi Gerganov f3dae90c31
Update README.md 2022-10-22 21:17:21 +03:00
Georgi Gerganov 8c1d970088
Update README.md 2022-10-22 19:00:25 +03:00
Georgi Gerganov 6b45e37b2b
Update README.md and finalize the whisper.wasm example 2022-10-22 18:54:01 +03:00
Georgi Gerganov 5698b51718
Update README.md 2022-10-20 17:52:59 +03:00
Georgi Gerganov 3fe3898ebb
Update README.md 2022-10-20 17:43:56 +03:00
Georgi Gerganov 81c185576c
Update README.md 2022-10-20 17:39:31 +03:00
Georgi Gerganov 1969ee4bc7
Update README.md 2022-10-18 22:20:35 +03:00
Georgi Gerganov 72d967bce4
Use Accelerate framework on Apple silicon
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Georgi Gerganov 36945162fa
Update README.md (ref #50) 2022-10-15 09:40:08 +03:00
Georgi Gerganov b2f1600aa3
Update README.md 2022-10-12 21:25:42 +03:00
Topping1 1348796a93
Update README.md (#43)
* Update README.md

Updated README.md to list new features, such as subtitle file support (VTT and SRT)

* Update README.md

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-10-12 07:32:14 +03:00
Georgi Gerganov 8d94358251
Update README.md 2022-10-11 00:36:32 +03:00
Georgi Gerganov ad6693fb64
Update README.md 2022-10-10 22:16:25 +03:00
Georgi Gerganov 63b6786767
Minor 2022-10-10 22:06:27 +03:00
Georgi Gerganov f7ab81fe51
Update README.md 2022-10-10 22:05:37 +03:00
Georgi Gerganov 4c4ab71d4d
Update README.md 2022-10-08 11:46:34 +03:00
Georgi Gerganov 2d47693435
Update README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov 700898e6ed
ref #22 : add option to provide multiple input .wav files 2022-10-05 23:44:10 +03:00
Georgi Gerganov 6b1c3cc198
Update README.md 2022-10-05 23:13:15 +03:00
Georgi Gerganov b8f713482e
Minor updates 2022-10-05 23:11:02 +03:00
Georgi Gerganov e7a15876f8
Update README.md 2022-10-04 23:27:25 +03:00
Georgi Gerganov d71e567656
Update README.md 2022-10-02 18:19:22 +03:00
Georgi Gerganov 62897e8ae6
Update README.md 2022-10-01 00:01:04 +03:00
Georgi Gerganov 3bcdbdfc32
Reduce memory usage even more + better sampling
- The encode/decode memory buffers are now reused
- If the 30-sec segment goes for too long without a timestamp token, we
  force one. Improves transcription for large model
- Stereo support
- Add "micro-machines.wav" sample
2022-09-30 19:35:27 +03:00
Georgi Gerganov 310f4883d1
Update README.md 2022-09-29 23:48:01 +03:00
Georgi Gerganov fd3f3d748f
Update README.md 2022-09-29 23:37:59 +03:00
Georgi Gerganov 5877c3578e
ref #4 : added transcription timestamps
Can be turned off with "-nt" argument.
Performance has also improved.
2022-09-29 23:09:39 +03:00
Georgi Gerganov 4352a6018b
Update README.md 2022-09-28 21:13:32 +03:00
Georgi Gerganov f888c2373d
Flash + language support (ref #2)
- Achieved big performance improvement + memory usage reduction
- Can now translate / transcribe different languages
2022-09-28 21:07:32 +03:00
Georgi Gerganov 476182e439
Update README.md and simplify usage 2022-09-26 09:36:51 +03:00
Georgi Gerganov f2456f8d93
Create README.md 2022-09-25 22:59:04 +03:00