28 commits

Author SHA1 Message Date
Georgi Gerganov 6a81ed3e78 main : print colors + no timestamps 2022-10-22 21:17:21 +03:00
Georgi Gerganov 7affd309d3 whisper : add new-segment callback
Can be used to process new segments as they are being generated.
Sample usage in main, for printing the resulting segments during the
inference.
2022-10-22 21:17:21 +03:00
Georgi Gerganov 8f95c25aed main : refactor subtitle output 2022-10-22 21:17:21 +03:00
Georgi Gerganov 31ff0c6a1f wip : experimental color coding of tokens based on probabilities 2022-10-22 21:17:21 +03:00
Georgi Gerganov 7d0dee7a8a
ref #68 : add option "-on" to specify segment index offset for SRT
Also, change option "-o" to "-ot"
2022-10-21 18:14:53 +03:00
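The segment index offset described in the commit above can be illustrated with a small sketch. This is not the code from the commit: the function name srt_cue and its parameters are hypothetical, and the timestamps are passed in as already-formatted strings. It assumes the offset is simply added to the 1-based cue number, which is what lets several SRT files produced from separate runs be concatenated with continuous numbering.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds one SRT cue.
// `index` is the 0-based segment index within the current run;
// `index_offset` is the value assumed to come from an option such as "-on".
std::string srt_cue(int index, int index_offset,
                    const std::string & start, const std::string & end,
                    const std::string & text) {
    assert(index >= 0);
    std::ostringstream os;
    // SRT cue numbers conventionally start at 1, hence the "+ 1".
    os << (index + 1 + index_offset) << "\n"
       << start << " --> " << end << "\n"
       << text << "\n\n";
    return os.str();
}
```

With an offset of 10, the first segment of a run is numbered 11, so the output can be appended after a file that already contains cues 1 through 10.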
Georgi Gerganov e30cf83158
ref #57, #62, #63 : remove unions in C-api + remove designated initializers
We are not ready for designated initializers - many compilers do not
support this C++ feature yet, so removing its non-trivial usages.
2022-10-18 18:17:24 +03:00
Georgi Gerganov 72d967bce4 Use Accelerate framework on Apple silicon
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Topping1 50b5fe964c
Update main.cpp 2022-10-09 23:35:10 -05:00
Georgi Gerganov 4a6bf11db3 Minor 2022-10-08 18:13:26 +03:00
Georgi Gerganov 9bbca3110f ref #9 : add API documentation in whisper.h 2022-10-08 18:09:56 +03:00
Georgi Gerganov 2ca8cc77b2 ref #17 : print whisper logs to stderr
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
Georgi Gerganov 8c7c018893 ref #17 : add options to output result to file
Support for:

- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
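One practical difference between the VTT and SRT formats listed above is the timestamp separator: SRT writes HH:MM:SS,mmm with a comma before the milliseconds, while VTT uses a period. The sketch below shows one way to format both from a millisecond count. The function name format_timestamp and the choice of milliseconds as the input unit are assumptions made for illustration, not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a non-negative time in milliseconds as
// "HH:MM:SS,mmm" (comma == true, SRT style) or
// "HH:MM:SS.mmm" (comma == false, VTT style).
std::string format_timestamp(int64_t ms, bool comma) {
    assert(ms >= 0);
    int64_t rest = ms;
    const int64_t hr  = rest / 3600000; rest -= hr  * 3600000;
    const int64_t min = rest / 60000;   rest -= min * 60000;
    const int64_t sec = rest / 1000;    rest -= sec * 1000;

    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld%c%03lld",
                  (long long) hr, (long long) min, (long long) sec,
                  comma ? ',' : '.', (long long) rest);
    return std::string(buf);
}
```

Keeping the separator as a parameter means the same routine can serve both subtitle writers, and the plain-text output can skip timestamps entirely.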
Georgi Gerganov 7787b878e1
ref #16, #22 : add "offset" argument
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2022-10-07 22:00:40 +03:00
Georgi Gerganov 700898e6ed
ref #22 : add option to provide multiple input .wav files 2022-10-05 23:44:10 +03:00
Georgi Gerganov ce1fe95902 wip : improve makefile 2022-10-05 23:03:46 +03:00
Артём Земляк 495b81b367 Fix: main get n_threads from cli 2022-10-05 09:47:48 +07:00
Артём Земляк f007e186fe Fix: main get language from cli args 2022-10-05 09:24:53 +07:00
Georgi Gerganov 6814cc9b02 Improve result printing 2022-10-04 23:18:15 +03:00
Georgi Gerganov eba33adadd Extend C-style API with full inference methods 2022-10-04 23:18:15 +03:00
Georgi Gerganov 6b77124e01 Initial C-style interface for whisper.cpp 2022-10-04 23:18:15 +03:00
Georgi Gerganov 77d929f603
Fix bug in FFT
The FFT routine does not work for odd N
Solution is to add DFT and use it when N is odd
2022-10-02 17:46:21 +03:00
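The fix described in the commit above can be sketched as follows. A radix-2 Cooley-Tukey FFT splits the input into even-indexed and odd-indexed halves, which only works when N is even; when the recursion reaches an odd N greater than 1, a direct O(N^2) DFT can take over. This is a minimal sketch of that idea, not the code from the commit: the names dft, fft and cvec, and the use of std::complex<double>, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cvec = std::vector<std::complex<double>>;

// Direct O(N^2) discrete Fourier transform; valid for any N.
cvec dft(const cvec & in) {
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    cvec out(N);
    for (std::size_t k = 0; k < N; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            const double angle = -2.0 * pi * double(k) * double(n) / double(N);
            sum += in[n] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 FFT that falls back to dft() whenever N is odd.
cvec fft(const cvec & in) {
    assert(!in.empty());
    const std::size_t N = in.size();
    if (N == 1) return in;
    if (N % 2 == 1) return dft(in); // the even/odd split is impossible here

    cvec even(N / 2), odd(N / 2);
    for (std::size_t i = 0; i < N / 2; ++i) {
        even[i] = in[2 * i];
        odd[i]  = in[2 * i + 1];
    }
    const cvec fe = fft(even);
    const cvec fo = fft(odd);

    const double pi = std::acos(-1.0);
    cvec out(N);
    for (std::size_t k = 0; k < N / 2; ++k) {
        // Twiddle factor e^{-2*pi*i*k/N} applied to the odd half.
        const std::complex<double> t =
            std::polar(1.0, -2.0 * pi * double(k) / double(N)) * fo[k];
        out[k]         = fe[k] + t;
        out[k + N / 2] = fe[k] - t;
    }
    return out;
}
```

For an input of length 6, for example, the first split yields two halves of length 3, and each of those is handled by the direct DFT. The cost of the fallback depends on the largest odd factor of N, so lengths that are powers of two still run at full FFT speed.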
Georgi Gerganov 6d654d192a
Fix reading of stereo WAV files 2022-10-01 08:41:57 +03:00
Georgi Gerganov 15b49e8baf
Bug fix
Longer prompts could cause out-of-bounds access
2022-09-30 20:37:29 +03:00
Georgi Gerganov 3bcdbdfc32
Reduce memory usage even more + better sampling
- The encode/decode memory buffers are now reused
- If the 30-sec segment goes for too long without a timestamp token, we
  force one. Improves transcription for large model
- Stereo support
- Add "micro-machines.wav" sample
2022-09-30 19:35:27 +03:00
Georgi Gerganov 5877c3578e
ref #4 : added transcription timestamps
Can be turned off with "-nt" argument.
Performance has also improved.
2022-09-29 23:09:39 +03:00
Georgi Gerganov f888c2373d
Flash + language support (ref #2)
- Achieved big performance improvement + memory usage reduction
- Can now translate / transcribe different languages
2022-09-28 21:07:32 +03:00
Georgi Gerganov 476182e439
Update README.md and simplify usage 2022-09-26 09:36:51 +03:00
Georgi Gerganov b0a11594ae
Initial release 2022-09-25 22:13:49 +03:00