Commit graph

241 commits

Author SHA1 Message Date
Georgi Gerganov 7eeef0358a
ref #52 : improve greedy sampling strategy
Force timestamp token to be sampled if the probability sum over all
timestamp tokens is above the probability of any other token
2022-10-18 19:48:15 +03:00
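The rule in the commit message above can be read as a small decision step on top of greedy decoding. The following is only an illustrative sketch in Python, not the repository's code: the function name `pick_token`, the parameter `timestamp_begin`, and the assumption that every token id at or above `timestamp_begin` is a timestamp token are all hypothetical. The choice to return the most probable timestamp token when the rule fires is likewise an assumption; the commit message only says a timestamp token is forced.

```python
def pick_token(probs, timestamp_begin):
    """Greedy pick with a forced-timestamp rule (illustrative sketch).

    probs: list of probabilities indexed by token id.
    timestamp_begin: hypothetical id of the first timestamp token; every id
    at or above it is treated as a timestamp token (an assumption).
    """
    text_probs = probs[:timestamp_begin]
    ts_probs = probs[timestamp_begin:]
    if not text_probs and not ts_probs:
        raise ValueError("probs must not be empty")

    def argmax(values):
        # Index of the largest value; the first one wins on ties.
        return max(range(len(values)), key=values.__getitem__)

    if not ts_probs:
        return argmax(text_probs)
    if not text_probs:
        return timestamp_begin + argmax(ts_probs)

    best_text = argmax(text_probs)
    # Compare the total timestamp mass with the best non-timestamp token.
    if sum(ts_probs) > text_probs[best_text]:
        return timestamp_begin + argmax(ts_probs)
    return best_text
```

With `probs = [0.30, 0.25, 0.20, 0.25]` and `timestamp_begin = 2`, the two timestamp tokens together outweigh the best text token, so a timestamp token is returned even though no single timestamp token has the highest probability.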
Georgi Gerganov 632660abb9
CMake support for Accelerate framework 2022-10-18 18:51:59 +03:00
Georgi Gerganov e36aabe00d
Correct implementation of FP16 GELU
Can toggle it via the GGML_GELU_FP16 macro
2022-10-18 18:42:08 +03:00
Georgi Gerganov 2d171ced32
close #32 : add comment about thread-safety of the C-style API 2022-10-18 18:27:57 +03:00
Georgi Gerganov e30cf83158
ref #57, #62, #63 : remove unions in C-api + remove designated initializers
We are not ready for designated initializers - many compilers do not
support this C++ feature yet, so removing its non-trivial usages.
2022-10-18 18:17:24 +03:00
Georgi Gerganov d6b84b2a23
ref #62 : fix build for some compilers
For some reason, newer versions of GCC panic when the struct type is not
specified explicitly
2022-10-18 10:57:03 +03:00
Georgi Gerganov b4a3875b2c
Revert recent sampling change
It does not actually help and seems to produce worse results on some of
the samples
2022-10-18 08:26:16 +03:00
Georgi Gerganov cf67bfffa0 Fix EOT token handling
If it is the end of the audio, pick all sampled tokens.
Otherwise, print error message.
2022-10-18 00:53:06 +03:00
Georgi Gerganov 91632eb6ea Revert GELU change
Seems it does not work on x86 for some reason
2022-10-18 00:45:08 +03:00
Georgi Gerganov b81a81d543 Link Accelerate framework to "stream" example 2022-10-18 00:12:51 +03:00
Georgi Gerganov d14823582d Try to improve the sampling strategy a bit
It still fails sometimes when it does not sample a timestamp token for
the entire segment. We now print a message in such cases
2022-10-18 00:12:51 +03:00
Georgi Gerganov 20d8e7a309 Fix memory sizes 2022-10-18 00:12:51 +03:00
Georgi Gerganov 72d967bce4 Use Accelerate framework on Apple silicon
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Georgi Gerganov 130b5c02d6 Adding helper script for converting the PT models 2022-10-18 00:12:51 +03:00
Georgi Gerganov 0e858f080d
close #56 : build on FreeBSD
Thanks to @abelbabel for the contribution
2022-10-17 18:10:16 +03:00
Georgi Gerganov f24d940ca9
Merge pull request #58 from r0y6a3n0/master
fix decode missing token issue
2022-10-17 18:06:02 +03:00
RyanChang 949f97a8b4 fix missing token issue 2022-10-17 21:19:45 +08:00
Georgi Gerganov 0ad085f5e8
ref #48 : clear results at the start of whisper_full
This way, even if the input audio is empty, the previous results will be
removed.
2022-10-15 09:55:28 +03:00
Georgi Gerganov 36945162fa
Update README.md (ref #50) 2022-10-15 09:40:08 +03:00
Georgi Gerganov b2f1600aa3
Update README.md 2022-10-12 21:25:42 +03:00
0/0 b799226973 check if spectrogram length is <100 before doing anything else
fixes #39
2022-10-12 07:32:42 +03:00
Topping1 1348796a93
Update README.md (#43)
* Update README.md

Updated README.md to list new features, such as subtitle file support (VTT and SRT)

* Update README.md

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-10-12 07:32:14 +03:00
Georgi Gerganov 40609cb49b
Merge pull request #42 from iboB/msvc-build
ref #5 : MSVC build
2022-10-12 07:31:41 +03:00
Borislav Stanimirov 0b45d25151 Building with MSVC 2022-10-11 21:40:46 +03:00
Borislav Stanimirov 28252352d7 Visual Studio ignored dirs 2022-10-11 20:57:33 +03:00
Georgi Gerganov 8d94358251
Update README.md 2022-10-11 00:36:32 +03:00
Georgi Gerganov ad6693fb64
Update README.md 2022-10-10 22:16:25 +03:00
Georgi Gerganov 01c9e96f64
stream : improve real-time transcription 2022-10-10 22:06:27 +03:00
Georgi Gerganov 63b6786767
Minor 2022-10-10 22:06:27 +03:00
Georgi Gerganov f7ab81fe51
Update README.md 2022-10-10 22:05:37 +03:00
Georgi Gerganov eac4f12777
Merge pull request #36 from Topping1/master
Fix SRT timestamp format from mm:ss.sss to hh:mm:ss.sss
2022-10-10 09:13:31 +03:00
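The format change named in the merge above (mm:ss.sss to hh:mm:ss.sss) amounts to splitting a duration into hours, minutes, seconds and milliseconds. A minimal sketch in Python, assuming the input is a non-negative integer count of milliseconds; the function name `format_timestamp` and the `sep` parameter are hypothetical and do not come from the repository. The separator is left configurable because SRT files commonly use a comma before the milliseconds, while the commit message writes a dot.

```python
def format_timestamp(ms, sep="."):
    """Format a duration in milliseconds as hh:mm:ss<sep>sss.

    ms: non-negative integer number of milliseconds (an assumption about
    the input unit; the original code may use a different one).
    sep: separator placed before the millisecond field.
    """
    if ms < 0:
        raise ValueError("ms must be non-negative")
    hours, rem = divmod(ms, 3_600_000)   # 60 * 60 * 1000 ms per hour
    minutes, rem = divmod(rem, 60_000)   # 60 * 1000 ms per minute
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"
```

For example, `format_timestamp(3_725_042)` covers 1 hour, 2 minutes, 5 seconds and 42 milliseconds, which a mm:ss.sss layout cannot represent without overflowing the minutes field.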
Georgi Gerganov 9d5723435f
ref #35 : add <stdbool.h> to whisper.h
"bool" type is not implicitly defined for some compilers.
2022-10-10 08:11:18 +03:00
Georgi Gerganov 6e29d8453c
Merge pull request #34 from tazz4843/master
Add static library make target
2022-10-10 08:05:57 +03:00
Topping1 50b5fe964c
Update main.cpp 2022-10-09 23:35:10 -05:00
0/0 64752acd27
add static library make target 2022-10-09 19:16:42 -06:00
Georgi Gerganov 7edaa7da4b
Merge pull request #31 from lkwq007/master
Add MinGW support
2022-10-09 17:52:46 +03:00
lnyan 4bbb8a587b Add MinGW support 2022-10-09 22:26:37 +08:00
Georgi Gerganov 4a6bf11db3 Minor 2022-10-08 18:13:26 +03:00
Georgi Gerganov 9bbca3110f ref #9 : add API documentation in whisper.h 2022-10-08 18:09:56 +03:00
Georgi Gerganov 5e563ef635 Fix Makefile for MacBook Intel 2022-10-08 17:35:55 +03:00
Georgi Gerganov 2ca8cc77b2 ref #17 : print whisper logs to stderr
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
Georgi Gerganov 8c7c018893 ref #17 : add options to output result to file
Support for:

- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
Georgi Gerganov 4c4ab71d4d
Update README.md 2022-10-08 11:46:34 +03:00
Georgi Gerganov b43b36e006 Update tests 2022-10-08 11:43:42 +03:00
Georgi Gerganov 37110d693e ci : add base model tests to GH Actions 2022-10-08 11:43:42 +03:00
Georgi Gerganov 2d47693435 Update README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov a53e06757f Create README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov 0e3ba2f9fc Adding dummy models for testing purposes 2022-10-08 11:43:42 +03:00
Georgi Gerganov 2f069335ab Adding sanitizer tests 2022-10-08 11:43:42 +03:00
Georgi Gerganov 29b041f79b Cleanup CMakeLists.txt 2022-10-08 09:02:41 +03:00