whisper.cpp

Author	SHA1	Message	Date
Georgi Gerganov	6a81ed3e78	main : print colors + no timestamps	2022-10-22 21:17:21 +03:00
Georgi Gerganov	7affd309d3	whisper : add new-segment callback Can be used to process new segments as they are being generated. Sample usage in main, for printing the resulting segments during the inference.	2022-10-22 21:17:21 +03:00
Georgi Gerganov	8f95c25aed	main : refactor subtitle output	2022-10-22 21:17:21 +03:00
Georgi Gerganov	31ff0c6a1f	wip : experimental color coding of tokens based on probabilities	2022-10-22 21:17:21 +03:00
Georgi Gerganov	7d0dee7a8a	ref #68 : add option "-on" to specify segment index offset for SRT Also, change option "-o" to "-ot"	2022-10-21 18:14:53 +03:00
Georgi Gerganov	e30cf83158	ref #57 , #62 , #63 : remove unions in C-api + remove designated initializers We are not ready for designated initializers - many compilers do not support this C++ feature yet, so removing its non-trivial usages.	2022-10-18 18:17:24 +03:00
Georgi Gerganov	72d967bce4	Use Accelerate framework on Apple silicon Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro) Also various extra optimizations: - Multi-threaded NORM operator - Faster GELU via F16 cast	2022-10-18 00:12:51 +03:00
Topping1	50b5fe964c	Update main.cpp	2022-10-09 23:35:10 -05:00
Georgi Gerganov	4a6bf11db3	Minor	2022-10-08 18:13:26 +03:00
Georgi Gerganov	9bbca3110f	ref #9 : add API documentation in whisper.h	2022-10-08 18:09:56 +03:00
Georgi Gerganov	2ca8cc77b2	ref #17 : print whisper logs to stderr Only the transcribed/translated text is printed to stdout. This way, one can redirect the result to a file.	2022-10-08 17:28:06 +03:00
Georgi Gerganov	8c7c018893	ref #17 : add options to output result to file Support for: - plain text - VTT - SRT	2022-10-08 17:22:22 +03:00
Georgi Gerganov	7787b878e1	ref #16 , #22 : add "offset" argument Allows processing of the input audio to start at some offset from the beginning. Useful for splitting a long job into multiple tasks.	2022-10-07 22:00:40 +03:00
Georgi Gerganov	700898e6ed	ref #22 : add option to provide multiple input .wav files	2022-10-05 23:44:10 +03:00
Georgi Gerganov	ce1fe95902	wip : improve makefile	2022-10-05 23:03:46 +03:00
Артём Земляк	495b81b367	Fix: main get n_threads from cli	2022-10-05 09:47:48 +07:00
Артём Земляк	f007e186fe	Fix: main get language from cli args	2022-10-05 09:24:53 +07:00
Georgi Gerganov	6814cc9b02	Improve result printing	2022-10-04 23:18:15 +03:00
Georgi Gerganov	eba33adadd	Extend C-style API with full inference methods	2022-10-04 23:18:15 +03:00
Georgi Gerganov	6b77124e01	Initial C-style interface for whisper.cpp	2022-10-04 23:18:15 +03:00
Georgi Gerganov	77d929f603	Fix bug in FFT The FFT routine does not work for odd N Solution is to add DFT and use it when N is odd	2022-10-02 17:46:21 +03:00
Georgi Gerganov	6d654d192a	Fix reading of stereo WAV files	2022-10-01 08:41:57 +03:00
Georgi Gerganov	15b49e8baf	Bug fix Longer prompts could cause out-of-bounds access	2022-09-30 20:37:29 +03:00
Georgi Gerganov	3bcdbdfc32	Reduce memory usage even more + better sampling - The encode/decode memory buffers are now reused - If the 30-sec segment goes for too long without a timestamp token, we force one. Improves transcription for large model - Stereo support - Add "micro-machines.wav" sample	2022-09-30 19:35:27 +03:00
Georgi Gerganov	5877c3578e	ref #4 : added transcription timestamps Can be turned off with "-nt" argument. Performance has also improved.	2022-09-29 23:09:39 +03:00
Georgi Gerganov	f888c2373d	Flash + language support (ref #2 ) - Achieved big performance improvement + memory usage reduction - Can now translate / transcribe different languages	2022-09-28 21:07:32 +03:00
Georgi Gerganov	476182e439	Update README.md and simplify usage	2022-09-26 09:36:51 +03:00
Georgi Gerganov	b0a11594ae	Initial release	2022-09-25 22:13:49 +03:00

28 commits
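
Commit 77d929f603 notes that the FFT routine does not work for odd N and that the fix is to add a DFT and use it when N is odd. The idea can be sketched as below. This is a minimal illustration and not the code from that commit: the names `cplx`, `dft` and `fft` and the use of `std::complex<double>` are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;  // hypothetical alias, for illustration only

// Naive O(N^2) discrete Fourier transform. Works for any length N.
std::vector<cplx> dft(const std::vector<cplx> & in) {
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(N);
    for (std::size_t k = 0; k < N; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            const double angle = -2.0 * pi * double(k) * double(n) / double(N);
            sum += in[n] * cplx(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 FFT. A radix-2 split needs an even length,
// so any odd-length input (other than N == 1) is handed to dft().
std::vector<cplx> fft(const std::vector<cplx> & in) {
    const std::size_t N = in.size();
    assert(N > 0);
    if (N == 1) {
        return in;
    }
    if (N % 2 == 1) {
        return dft(in);  // odd N: fall back to the plain DFT
    }

    // Split into even-indexed and odd-indexed samples.
    std::vector<cplx> even(N / 2), odd(N / 2);
    for (std::size_t i = 0; i < N / 2; ++i) {
        even[i] = in[2 * i];
        odd[i]  = in[2 * i + 1];
    }
    const std::vector<cplx> E = fft(even);
    const std::vector<cplx> O = fft(odd);

    // Combine the two half-size transforms with the twiddle factors.
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(N);
    for (std::size_t k = 0; k < N / 2; ++k) {
        const cplx t = std::polar(1.0, -2.0 * pi * double(k) / double(N)) * O[k];
        out[k]         = E[k] + t;
        out[k + N / 2] = E[k] - t;
    }
    return out;
}
```

With this structure, an input of length 6 is split once into two halves of length 3, and each half is then transformed by `dft`, so the result matches a direct DFT of the full input.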