whisper.cpp

Commit Graph

Author	SHA1	Message	Date
Jhen-Jie Hong	a5e60c019d	readme : add react-native bindings (#619)	2023-03-22 21:39:02 +02:00
Georgi Gerganov	1beff6f66d	models : change HF hosting from dataset to model	2023-03-22 20:44:56 +02:00
Georgi Gerganov	fa9d43181f	readme : add bench-wts.sh demo	2023-03-06 21:06:27 +02:00
Georgi Gerganov	ad1389003d	release : v1.2.1	2023-02-28 22:29:12 +02:00
Aaron Pham	d176160f6f	readme : add pybind11 bindings (#538)	2023-02-27 21:02:11 +02:00
Georgi Gerganov	ca21f7ab16	readme : add cython bindings (#9)	2023-02-24 08:46:06 +02:00
Georgi Gerganov	2407ae8ef0	readme : add Ruby discussion + update .NET discussion	2023-02-15 19:51:54 +02:00
Georgi Gerganov	9764782bd9	readme : add another .NET repo (#303)	2023-02-14 20:04:03 +02:00
Georgi Gerganov	3b010f9bed	readme : add .NET repo (#303)	2023-02-11 17:35:33 +02:00
Georgi Gerganov	b2083c5d02	release : v1.2.0	2023-02-04 09:49:49 +02:00
Georgi Gerganov	f3ee4a9673	whisper : reduce memory usage during inference (#431) * ggml : add "scratch" buffer support * ggml : support for scratch ring-buffer * ggml : bug fix in ggml_repeat() * ggml : error on scratch buffer overflow * whisper : use scratch buffers during inference (base model only) * whisper : update memory usage for all models * whisper : fix encoder memory usage * whisper : use whisper_context functions instead of macros * whisper : fix FF + remove it from README * ggml : reuse ggml_new_i32 * ggml : refactor the scratch buffer storage * whisper : reorder scratch buffers in the decoder * main : add option to disable temp fallback * Update README.md	2023-02-04 09:45:52 +02:00
Georgi Gerganov	2c3f50a021	release : v1.1.1	2023-01-23 20:23:44 +02:00
Georgi Gerganov	874bde887e	Update README.md	2023-01-16 18:47:31 +02:00
Georgi Gerganov	8738427dd6	cmake : bump version to 1.1.0	2023-01-15 14:33:13 +02:00
Georgi Gerganov	0b85e8c401	Update README.md	2023-01-15 11:36:20 +02:00
Georgi Gerganov	8de452c18b	Improve decoding (#291) * whisper : prepare infra for new decoding strategies * whisper : apply logit filters and compute logprobs * whisper : add whisper_get_logits() * whisper : separate self and cross attention memory Initial step needed for supporting parallel decoders * whisper : move probs_id buffer to whisper_context * whisper : refactor kv cache into separate struct * whisper : move self-attention kv cache to whisper_decoder * whisper : wip decoding parameters + strategies * whisper : wip decoding parameters + strategies (part 2) * whisper : wip decoding parameters + strategies (part 3) * whisper : wip decoding parameters + strategies (part 4) * whisper : fix prompt_past update to not include prompt_init * whisper : temperature + best_of support * whisper : support for compression_ration_threshold We actually use entropy, but it is similar * command : fix example to use logits instead of obsolete probs * whisper : handle empty sequence ranking * whisper : add WHISPER_DEBUG + diagnostic prints + new main args * whisper : minor fixes * whisper : add beam-search support * whisper : bug fix when there no previous context * whisper : add comments * stream : disable temperature fallback For real-time processing, we always want a single decoder running at T=0 * whisper.swiftui : update example - fix paths + add empty folders	2023-01-15 11:29:57 +02:00
Ian Bicking	5e9f33596f	readme : clarify main and stream usage (#391) Give an example of ./main that uses a sample file that's already there, and make the stream example clarify you need `make stream`	2023-01-08 20:18:41 +02:00
Thomas Fitzsimmons	1944e7c33e	whisper : document POWER VSX support	2023-01-05 23:53:00 +02:00
Georgi Gerganov	1480a5f1af	Update README.md Add SwiftUI example links	2022-12-23 11:02:46 +02:00
Georgi Gerganov	4c1fe0c813	Update README.md Add bindings links / discussions	2022-12-22 18:22:58 +02:00
Georgi Gerganov	afe2db0fe2	Add Roadmap	2022-12-16 23:41:57 +02:00
Georgi Gerganov	ea19ed33f1	Update README.md (#46) Add references to the new Android app	2022-12-16 19:28:51 +02:00
Georgi Gerganov	c37c2443c1	Update README.md (#56)	2022-12-16 18:01:05 +02:00
Georgi Gerganov	812ae3ffbd	Update README.md	2022-12-12 20:20:51 +02:00
Georgi Gerganov	fcf515de60	bench.wasm : same as "bench" but runs in the browser (#89)	2022-12-11 11:09:10 +02:00
Georgi Gerganov	3b1aacbe6d	talk : talk with AI in the terminal	2022-12-10 16:51:58 +02:00
Georgi Gerganov	3996ecc156	Update README.md	2022-12-07 05:15:46 +02:00
Georgi Gerganov	9fe7306f4b	models : add the new "large" model release by OpenAI The old "large" model is now renamed "large-v1". If you have been using it, make sure to rename it and download the new "large" model for best results.	2022-12-06 18:48:57 +02:00
Georgi Gerganov	6fd5358dd0	Update README.md	2022-11-27 11:30:32 +02:00
Georgi Gerganov	67e819baf4	minor : remove "examples/" prefix from the README	2022-11-26 13:07:54 +02:00
Georgi Gerganov	a425365b82	yt-wsp.sh : script to easily transcribe VODs Thanks to @DaniruKun ref: https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818 Usage: cd whisper.cpp make ./examples/yt-wsp.sh <video-url>	2022-11-26 12:54:42 +02:00
Georgi Gerganov	e0e864d9ca	Update README.md	2022-11-26 11:56:55 +02:00
Georgi Gerganov	68ecadbbc9	command.wasm : add voice assistant example for the Web (#171) Same as the command-line tool "command", but runs in the browser Also, added helper script "extra/deploy-wasm.sh" and fixed some timing constants for the WASM examples.	2022-11-26 11:40:06 +02:00
Georgi Gerganov	1246dd023e	command : add demonstration video	2022-11-25 20:23:58 +02:00
Georgi Gerganov	bc88eb13c6	examples : add "command" tool (#171)	2022-11-25 19:36:57 +02:00
Georgi Gerganov	b8ce25dec1	refactoring : more readable code	2022-11-25 19:28:04 +02:00
Georgi Gerganov	2c0501b38a	Update README.md	2022-11-24 20:06:51 +02:00
Georgi Gerganov	35cd29ce1f	ggml : fix cross-compile Linux -> Window with mingw (#168)	2022-11-23 22:28:41 +02:00
Georgi Gerganov	a156a358ca	Revert "update README.md" This reverts commit `6a84147113`.	2022-11-23 22:16:50 +02:00
katsu560	6a84147113	update README.md	2022-11-23 22:16:33 +02:00
Georgi Gerganov	363a2dadec	Update README.md	2022-11-23 09:53:55 +02:00
Georgi Gerganov	623a486056	Update README.md	2022-11-23 09:52:36 +02:00
Georgi Gerganov	2e311a2917	Update README.md	2022-11-21 18:52:20 +02:00
Georgi Gerganov	864a78a8d0	models : change default hosting to Hugging Face My Linode is running out of monthly bandwidth due to the big interest in the project	2022-11-15 19:47:06 +02:00
Georgi Gerganov	8fdfb0ba92	Update README.md	2022-11-06 21:04:21 +02:00
Georgi Gerganov	a09e9123ca	Update README.md	2022-11-05 08:44:41 +02:00
Georgi Gerganov	0e689f83d8	Update README.md	2022-11-02 22:03:27 +02:00
Georgi Gerganov	d5afebd37c	whisper : token-level timestamp refactoring (#49, #120) This turned out pretty good overall. The algorithm has been moved from main.cpp to whisper.cpp and can be reused for all subtitles types. This means that now you can specify the maximum length of the generated lines. Simply provide the "-ml" argument specifying the max length in number of characters	2022-11-02 21:45:54 +02:00
Georgi Gerganov	4b1c32e8ea	Update README.md	2022-11-02 18:33:29 +02:00
Georgi Gerganov	b5dde365e9	extra : compute SHA of all models files	2022-11-02 18:31:55 +02:00
Georgi Gerganov	e46bc56e71	Update README.md	2022-11-01 22:47:58 +02:00
Georgi Gerganov	b0f2aa0ea6	Update README.md	2022-10-30 17:10:46 +02:00
Georgi Gerganov	2c281d190b	Update README.md	2022-10-28 22:09:40 +03:00
Georgi Gerganov	9ccafa8792	Update README.md	2022-10-25 20:53:48 +03:00
Georgi Gerganov	89d8ee3ee5	Update README.md	2022-10-25 20:53:48 +03:00
Georgi Gerganov	c6710efde2	refactoring : move main + stream in examples + other stuff	2022-10-25 20:53:48 +03:00
Georgi Gerganov	728676927f	Update README.md	2022-10-24 18:26:21 +03:00
Georgi Gerganov	181b762de8	Update README.md	2022-10-23 12:47:51 +03:00
Georgi Gerganov	4196856c7b	Update README.md	2022-10-23 10:24:36 +03:00
Georgi Gerganov	705198f063	Update README.md	2022-10-23 10:12:10 +03:00
Georgi Gerganov	3e69a6071d	Update README.md	2022-10-23 08:04:33 +03:00
Georgi Gerganov	f3dae90c31	Update README.md	2022-10-22 21:17:21 +03:00
Georgi Gerganov	8c1d970088	Update README.md	2022-10-22 19:00:25 +03:00
Georgi Gerganov	6b45e37b2b	Update README.md and finalize the whisper.wasm example	2022-10-22 18:54:01 +03:00
Georgi Gerganov	5698b51718	Update README.md	2022-10-20 17:52:59 +03:00
Georgi Gerganov	3fe3898ebb	Update README.md	2022-10-20 17:43:56 +03:00
Georgi Gerganov	81c185576c	Update README.md	2022-10-20 17:39:31 +03:00
Georgi Gerganov	1969ee4bc7	Update README.md	2022-10-18 22:20:35 +03:00
Georgi Gerganov	72d967bce4	Use Accelerate framework on Apple silicon Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro) Also various extra optimizations: - Multi-threaded NORM operator - Faster GELU via F16 cast	2022-10-18 00:12:51 +03:00
Georgi Gerganov	36945162fa	Update README.md (ref #50)	2022-10-15 09:40:08 +03:00
Georgi Gerganov	b2f1600aa3	Update README.md	2022-10-12 21:25:42 +03:00
Topping1	1348796a93	Update README.md (#43) * Update README.md Updated README.md to list new features, such as subtitle file support (VTT and SRT) * Update README.md Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2022-10-12 07:32:14 +03:00
Georgi Gerganov	8d94358251	Update README.md	2022-10-11 00:36:32 +03:00
Georgi Gerganov	ad6693fb64	Update README.md	2022-10-10 22:16:25 +03:00
Georgi Gerganov	63b6786767	Minor	2022-10-10 22:06:27 +03:00
Georgi Gerganov	f7ab81fe51	Update README.md	2022-10-10 22:05:37 +03:00
Georgi Gerganov	4c4ab71d4d	Update README.md	2022-10-08 11:46:34 +03:00
Georgi Gerganov	2d47693435	Update README.md	2022-10-08 11:43:42 +03:00
Georgi Gerganov	700898e6ed	ref #22 : add option to provide multiple input .wav files	2022-10-05 23:44:10 +03:00
Georgi Gerganov	6b1c3cc198	Update README.md	2022-10-05 23:13:15 +03:00
Georgi Gerganov	b8f713482e	Minor updates	2022-10-05 23:11:02 +03:00
Georgi Gerganov	e7a15876f8	Update README.md	2022-10-04 23:27:25 +03:00
Georgi Gerganov	d71e567656	Update README.md	2022-10-02 18:19:22 +03:00
Georgi Gerganov	62897e8ae6	Update README.md	2022-10-01 00:01:04 +03:00
Georgi Gerganov	3bcdbdfc32	Reduce memory usage even more + better sampling - The encode/decode memory buffers are now reused - If the 30-sec segment goes for too long without a timestamp token, we force one. Improves transcription for large model - Stereo support - Add "micro-machines.wav" sample	2022-09-30 19:35:27 +03:00
Georgi Gerganov	310f4883d1	Update README.md	2022-09-29 23:48:01 +03:00
Georgi Gerganov	fd3f3d748f	Update README.md	2022-09-29 23:37:59 +03:00
Georgi Gerganov	5877c3578e	ref #4 : added transcription timestamps Can be turned off with "-nt" argument. Performance has also improved.	2022-09-29 23:09:39 +03:00
Georgi Gerganov	4352a6018b	Update README.md	2022-09-28 21:13:32 +03:00
Georgi Gerganov	f888c2373d	Flash + language support (ref #2) - Achieved big performance improvement + memory usage reduction - Can now translate / transcribe different languages	2022-09-28 21:07:32 +03:00
Georgi Gerganov	476182e439	Update README.md and simplify usage	2022-09-26 09:36:51 +03:00
Georgi Gerganov	f2456f8d93	Create README.md	2022-09-25 22:59:04 +03:00

142 Commits (9a0b59d990be319952a4a02b9164b3b2327cd454)