Commit graph

22 commits

Author SHA1 Message Date
Georgi Gerganov 72d967bce4 Use Accelerate framework on Apple silicon
Huge performance improvement in the Encode step (almost 2x on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Topping1 50b5fe964c
Update main.cpp 2022-10-09 23:35:10 -05:00
Georgi Gerganov 4a6bf11db3 Minor 2022-10-08 18:13:26 +03:00
Georgi Gerganov 9bbca3110f ref #9 : add API documentation in whisper.h 2022-10-08 18:09:56 +03:00
Georgi Gerganov 2ca8cc77b2 ref #17 : print whisper logs to stderr
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
Georgi Gerganov 8c7c018893 ref #17 : add options to output result to file
Support for:

- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
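
The SRT and VTT outputs added in commit 8c7c018893 differ mainly in the timestamp separator: SRT writes `HH:MM:SS,mmm` and VTT writes `HH:MM:SS.mmm`. The following is a minimal sketch of that formatting, not the code from the commit. The names `format_timestamp` and `srt_cue` are hypothetical, and timestamps are assumed to be in milliseconds.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a millisecond timestamp as HH:MM:SS,mmm (SRT)
// or HH:MM:SS.mmm (VTT). Assumes the timestamp is non-negative.
std::string format_timestamp(int64_t ms, bool use_comma) {
    assert(ms >= 0);
    const int64_t hr = ms / 3600000; ms -= hr * 3600000;
    const int64_t mn = ms / 60000;   ms -= mn * 60000;
    const int64_t sc = ms / 1000;    ms -= sc * 1000;

    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02d:%02d:%02d%s%03d",
                  (int) hr, (int) mn, (int) sc, use_comma ? "," : ".", (int) ms);
    return buf;
}

// Hypothetical helper: build one SRT cue (index, time range, text, blank line).
std::string srt_cue(int index, int64_t t0_ms, int64_t t1_ms, const std::string & text) {
    return std::to_string(index) + "\n" +
           format_timestamp(t0_ms, true) + " --> " + format_timestamp(t1_ms, true) + "\n" +
           text + "\n\n";
}
```

A VTT file would use `format_timestamp(t, false)` for each cue and start with a `WEBVTT` header line.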
Georgi Gerganov 7787b878e1
ref #16, #22 : add "offset" argument
Allows processing to start at some offset from the beginning of the
input audio. Useful for splitting a long job into multiple tasks.
2022-10-07 22:00:40 +03:00
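
One way to use the "offset" argument from commit 7787b878e1 for splitting a long job is to compute one start offset per task. The sketch below is only an illustration: `Chunk` and `split_job` are hypothetical names, the millisecond unit is an assumption, and the per-task duration is something the caller would have to enforce separately, since the commit message describes only an offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one task: where it starts and how much audio it covers.
struct Chunk {
    int64_t offset_ms;   // assumed unit: milliseconds from the start of the audio
    int64_t duration_ms; // length of audio this task is responsible for
};

// Split total_ms of audio into n_tasks contiguous chunks of near-equal length.
// Any remainder is spread over the first chunks, one millisecond each.
std::vector<Chunk> split_job(int64_t total_ms, int n_tasks) {
    assert(total_ms >= 0 && n_tasks > 0);

    const int64_t base = total_ms / n_tasks;
    const int64_t rem  = total_ms % n_tasks;

    std::vector<Chunk> chunks;
    int64_t offset = 0;
    for (int i = 0; i < n_tasks; ++i) {
        const int64_t len = base + (i < rem ? 1 : 0);
        chunks.push_back({offset, len});
        offset += len;
    }
    return chunks;
}
```

Cutting at fixed offsets can split a word across two tasks, so in practice a small overlap between neighbouring chunks may be needed.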
Georgi Gerganov 700898e6ed
ref #22 : add option to provide multiple input .wav files 2022-10-05 23:44:10 +03:00
Georgi Gerganov ce1fe95902 wip : improve makefile 2022-10-05 23:03:46 +03:00
Артём Земляк 495b81b367 Fix: main get n_threads from cli 2022-10-05 09:47:48 +07:00
Артём Земляк f007e186fe Fix: main get language from cli args 2022-10-05 09:24:53 +07:00
Georgi Gerganov 6814cc9b02 Improve result printing 2022-10-04 23:18:15 +03:00
Georgi Gerganov eba33adadd Extend C-style API with full inference methods 2022-10-04 23:18:15 +03:00
Georgi Gerganov 6b77124e01 Initial C-style interface for whisper.cpp 2022-10-04 23:18:15 +03:00
Georgi Gerganov 77d929f603
Fix bug in FFT
The FFT routine does not work for odd N
Solution is to add DFT and use it when N is odd
2022-10-02 17:46:21 +03:00
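
The fix described in commit 77d929f603 can be sketched as follows: a recursive radix-2 FFT halves the input at every level, which only works while N is even, so an odd N is handed to a plain O(N^2) DFT instead. This is a minimal illustration under assumed conventions (real input, interleaved real/imaginary output), not the routine from the repository. The function names `dft` and `fft` are chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Naive O(N^2) DFT of a real signal. Output holds N interleaved (re, im) pairs.
std::vector<float> dft(const std::vector<float> & in) {
    const size_t N = in.size();
    const double pi = 3.14159265358979323846;
    std::vector<float> out(2 * N);
    for (size_t k = 0; k < N; ++k) {
        double re = 0.0, im = 0.0;
        for (size_t n = 0; n < N; ++n) {
            const double angle = 2.0 * pi * double(k) * double(n) / double(N);
            re += in[n] * std::cos(angle);
            im -= in[n] * std::sin(angle);
        }
        out[2 * k]     = float(re);
        out[2 * k + 1] = float(im);
    }
    return out;
}

// Recursive radix-2 FFT of a real signal, falling back to dft() when N is odd.
std::vector<float> fft(const std::vector<float> & in) {
    const size_t N = in.size();
    if (N == 0) return {};
    if (N == 1) return { in[0], 0.0f };
    if (N % 2 == 1) return dft(in); // cannot split an odd length in half

    std::vector<float> even, odd;
    for (size_t i = 0; i < N; ++i) {
        (i % 2 == 0 ? even : odd).push_back(in[i]);
    }
    const std::vector<float> E = fft(even);
    const std::vector<float> O = fft(odd);
    assert(E.size() == N && O.size() == N); // N/2 complex values each

    const double pi = 3.14159265358979323846;
    std::vector<float> out(2 * N);
    for (size_t k = 0; k < N / 2; ++k) {
        const double angle = 2.0 * pi * double(k) / double(N);
        const float wr = float(std::cos(angle));
        const float wi = float(-std::sin(angle));
        // Twiddle factor times the odd-half term: (wr + i*wi) * O[k]
        const float tr = wr * O[2 * k] - wi * O[2 * k + 1];
        const float ti = wr * O[2 * k + 1] + wi * O[2 * k];
        out[2 * k]               = E[2 * k] + tr;
        out[2 * k + 1]           = E[2 * k + 1] + ti;
        out[2 * (k + N / 2)]     = E[2 * k] - tr;
        out[2 * (k + N / 2) + 1] = E[2 * k + 1] - ti;
    }
    return out;
}
```

For N = 6, for example, the first level splits into two halves of length 3, and each of those is computed by `dft`.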
Georgi Gerganov 6d654d192a
Fix reading of stereo WAV files 2022-10-01 08:41:57 +03:00
Georgi Gerganov 15b49e8baf
Bug fix
Longer prompts could cause out-of-bounds access
2022-09-30 20:37:29 +03:00
Georgi Gerganov 3bcdbdfc32
Reduce memory usage even more + better sampling
- The encode/decode memory buffers are now reused
- If the 30-sec segment goes on for too long without a timestamp token, we
  force one. Improves transcription for the large model
- Stereo support
- Add "micro-machines.wav" sample
2022-09-30 19:35:27 +03:00
Georgi Gerganov 5877c3578e
ref #4 : added transcription timestamps
Can be turned off with "-nt" argument.
Performance has also improved.
2022-09-29 23:09:39 +03:00
Georgi Gerganov f888c2373d
Flash + language support (ref #2)
- Achieved big performance improvement + memory usage reduction
- Can now translate / transcribe different languages
2022-09-28 21:07:32 +03:00
Georgi Gerganov 476182e439
Update README.md and simplify usage 2022-09-26 09:36:51 +03:00
Georgi Gerganov b0a11594ae
Initial release 2022-09-25 22:13:49 +03:00