Commit graph

329 commits

Author SHA1 Message Date
Georgi Gerganov 72d967bce4 Use Accelerate framework on Apple silicon
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Georgi Gerganov 130b5c02d6 Adding helper script for converting the PT models 2022-10-18 00:12:51 +03:00
Georgi Gerganov 0e858f080d
close #56 : build on FreeBSD
Thanks to @abelbabel for the contribution
2022-10-17 18:10:16 +03:00
Georgi Gerganov f24d940ca9
Merge pull request #58 from r0y6a3n0/master
fix decode missing token issue
2022-10-17 18:06:02 +03:00
RyanChang 949f97a8b4 fix missing token issue 2022-10-17 21:19:45 +08:00
Georgi Gerganov 0ad085f5e8
ref #48 : clear results at the start of whisper_full
This way, even if the input audio is empty, the previous results will be
removed.
2022-10-15 09:55:28 +03:00
Georgi Gerganov 36945162fa
Update README.md (ref #50) 2022-10-15 09:40:08 +03:00
Georgi Gerganov b2f1600aa3
Update README.md 2022-10-12 21:25:42 +03:00
0/0 b799226973 check if spectrogram length is <100 before doing anything else
fixes #39
2022-10-12 07:32:42 +03:00
Topping1 1348796a93
Update README.md (#43)
* Update README.md

Updated README.md to list new features, such as subtitle file support (VTT and SRT)

* Update README.md

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-10-12 07:32:14 +03:00
Georgi Gerganov 40609cb49b
Merge pull request #42 from iboB/msvc-build
ref #5 : MSVC build
2022-10-12 07:31:41 +03:00
Borislav Stanimirov 0b45d25151 Building with MSVC 2022-10-11 21:40:46 +03:00
Borislav Stanimirov 28252352d7 Visual Studio ignored dirs 2022-10-11 20:57:33 +03:00
Georgi Gerganov 8d94358251
Update README.md 2022-10-11 00:36:32 +03:00
Georgi Gerganov ad6693fb64
Update README.md 2022-10-10 22:16:25 +03:00
Georgi Gerganov 01c9e96f64
stream : improve real-time transcription 2022-10-10 22:06:27 +03:00
Georgi Gerganov 63b6786767
Minor 2022-10-10 22:06:27 +03:00
Georgi Gerganov f7ab81fe51
Update README.md 2022-10-10 22:05:37 +03:00
Georgi Gerganov eac4f12777
Merge pull request #36 from Topping1/master
Fix SRT timestamp format from mm:ss.sss to hh:mm:ss.sss
2022-10-10 09:13:31 +03:00
Georgi Gerganov 9d5723435f
ref #35 : add <stdbool.h> to whisper.h
"bool" type is not implicitly defined for some compilers.
2022-10-10 08:11:18 +03:00
Georgi Gerganov 6e29d8453c
Merge pull request #34 from tazz4843/master
Add static library make target
2022-10-10 08:05:57 +03:00
Topping1 50b5fe964c
Update main.cpp 2022-10-09 23:35:10 -05:00
0/0 64752acd27
add static library make target 2022-10-09 19:16:42 -06:00
Georgi Gerganov 7edaa7da4b
Merge pull request #31 from lkwq007/master
Add MinGW support
2022-10-09 17:52:46 +03:00
lnyan 4bbb8a587b Add MinGW support 2022-10-09 22:26:37 +08:00
Georgi Gerganov 4a6bf11db3 Minor 2022-10-08 18:13:26 +03:00
Georgi Gerganov 9bbca3110f ref #9 : add API documentation in whisper.h 2022-10-08 18:09:56 +03:00
Georgi Gerganov 5e563ef635 Fix Makefile for MacBook Intel 2022-10-08 17:35:55 +03:00
Georgi Gerganov 2ca8cc77b2 ref #17 : print whisper logs to stderr
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
Georgi Gerganov 8c7c018893 ref #17 : add options to output result to file
Support for:

- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
Georgi Gerganov 4c4ab71d4d
Update README.md 2022-10-08 11:46:34 +03:00
Georgi Gerganov b43b36e006 Update tests 2022-10-08 11:43:42 +03:00
Georgi Gerganov 37110d693e ci : add base model tests to GH Actions 2022-10-08 11:43:42 +03:00
Georgi Gerganov 2d47693435 Update README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov a53e06757f Create README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov 0e3ba2f9fc Adding dummy models for testing purposes 2022-10-08 11:43:42 +03:00
Georgi Gerganov 2f069335ab Adding sanitizer tests 2022-10-08 11:43:42 +03:00
Georgi Gerganov 29b041f79b Cleanup CMakeLists.txt 2022-10-08 09:02:41 +03:00
Georgi Gerganov 4a732b2879 cmake : fixes 2022-10-08 09:02:41 +03:00
Georgi Gerganov 68f5962be6 ci : add cmake builds 2022-10-08 09:02:41 +03:00
Georgi Gerganov 332c9d77fe whisper : fix bug in token sampling logic
Could overflow buffer
2022-10-08 09:02:41 +03:00
Georgi Gerganov 877c058179 Add CMake support 2022-10-08 09:02:41 +03:00
Georgi Gerganov 481cd685d5
ref #10 : option to keep context in "stream" example
Seems the results become worse when we keep the context, so by default
this is not enabled
2022-10-07 22:30:44 +03:00
Georgi Gerganov 3f15bb8a08
ref #10 : add "step" argument for "stream" example
Controls how often we run the inference.
By default, we run it every 3 seconds.
2022-10-07 22:07:24 +03:00
Georgi Gerganov 7787b878e1
ref #16, #22 : add "offset" argument
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2022-10-07 22:00:40 +03:00
Georgi Gerganov e29a5dacc6
ref #11, #18, #26 : fix CACHE_LINE_SIZE constant 2022-10-07 21:56:44 +03:00
Georgi Gerganov 844d60b284 Add CI using GitHub Actions 2022-10-07 18:34:27 +03:00
Georgi Gerganov 700898e6ed
ref #22 : add option to provide multiple input .wav files 2022-10-05 23:44:10 +03:00
Georgi Gerganov 6b1c3cc198
Update README.md 2022-10-05 23:13:15 +03:00
Georgi Gerganov b8f713482e
Minor updates 2022-10-05 23:11:02 +03:00