129 Commits (c6710efde20e858e8588aac65dc1f96f5a815f31)

Author SHA1 Message Date
Georgi Gerganov c6710efde2 refactoring : move main + stream in examples + other stuff 2022-10-25 20:53:48 +03:00
Georgi Gerganov 4c68f4cac0 main : fix SRT timestamp to use comma "," instead of dot "." 2022-10-24 18:28:23 +03:00
Georgi Gerganov 728676927f Update README.md 2022-10-24 18:26:21 +03:00
Georgi Gerganov d4f94ce427 Update README.md 2022-10-24 18:23:07 +03:00
Georgi Gerganov a52ee08c1e objc : polishing the sample application 2022-10-24 18:23:07 +03:00
Georgi Gerganov b41f4a90eb Create README.md 2022-10-24 18:23:07 +03:00
Georgi Gerganov bb1ee266d2 ios : whisper.objc example 2022-10-24 18:23:07 +03:00
Georgi Gerganov 5f7e9fa2dc ref #68, #79 : fix segment time output 2022-10-23 13:30:30 +03:00
Georgi Gerganov 181b762de8 Update README.md 2022-10-23 12:47:51 +03:00
Georgi Gerganov 3d37ad5133 Merge pull request #78 from jokkebk/Specify-utf8-for-vocab.json
Add enconding parameter to vocab.json opening to fix errors
2022-10-23 12:23:04 +03:00
Joonas Pihlajamaa 4e887dc350 Add enconding parameter to vocab.json opening to fix errors 2022-10-23 11:55:01 +03:00
Georgi Gerganov 4196856c7b Update README.md 2022-10-23 10:24:36 +03:00
Georgi Gerganov 705198f063 Update README.md 2022-10-23 10:12:10 +03:00
Georgi Gerganov 3e69a6071d Update README.md 2022-10-23 08:04:33 +03:00
Georgi Gerganov f3dae90c31 Update README.md 2022-10-22 21:17:21 +03:00
Georgi Gerganov 6a81ed3e78 main : print colors + no timestamps 2022-10-22 21:17:21 +03:00
Georgi Gerganov 7affd309d3 whisper : add new-segment callback
Can be used to process new segments as they are being generated.
Sample usage in main, for printing the resulting segments during the
inference.
2022-10-22 21:17:21 +03:00
Georgi Gerganov 8f95c25aed main : refactor subtitle output 2022-10-22 21:17:21 +03:00
Georgi Gerganov 31ff0c6a1f wip : experimental color coding of tokens based on probabilities 2022-10-22 21:17:21 +03:00
Georgi Gerganov f4aa01c2f8 Update README.md 2022-10-22 19:30:35 +03:00
Georgi Gerganov 8c1d970088 Update README.md 2022-10-22 19:00:25 +03:00
Georgi Gerganov 6b45e37b2b Update README.md and finalize the whisper.wasm example 2022-10-22 18:54:01 +03:00
Georgi Gerganov 491ecd7056 wip : polishing WASM example 2022-10-22 18:54:01 +03:00
Georgi Gerganov db460b78ff wip : WASM 128-bit SIMD support 2022-10-22 18:54:01 +03:00
Georgi Gerganov e905c6f827 wip : initial WASM port
Works but it is very slow because no SIMD is used.
For example, jfk.wav is processed in ~23 seconds using "tiny.en" model
2022-10-22 18:54:01 +03:00
Georgi Gerganov 7d0dee7a8a ref #68 : add option "-on" to specify segment index offset for SRT
Also, change option "-o" to "-ot"
2022-10-21 18:14:53 +03:00
Georgi Gerganov 8d15a1c635 ci : fix and re-enable tests (2nd try) 2022-10-21 15:57:20 +03:00
Georgi Gerganov 692aa0784f Revert "ci : fix and re-enable tests"
This reverts commit 80aefc9514.
2022-10-21 15:36:19 +03:00
Georgi Gerganov 80aefc9514 ci : fix and re-enable tests 2022-10-21 15:27:30 +03:00
Georgi Gerganov 5698b51718 Update README.md 2022-10-20 17:52:59 +03:00
Georgi Gerganov 3fe3898ebb Update README.md 2022-10-20 17:43:56 +03:00
Georgi Gerganov 81c185576c Update README.md 2022-10-20 17:39:31 +03:00
Georgi Gerganov 744bd47685 Merge pull request #67 from undefdev/defensive-apple-arm-make
added handling for falsely as x86_64 announced ARM Macs
2022-10-19 09:29:43 +03:00
Georgi Gerganov 66b3169d39 ci : disable tests temporarily 2022-10-19 08:37:18 +03:00
undef 19a780afe5 added handling for falsely as x86_64 announced ARM Macs 2022-10-19 01:01:53 +02:00
Georgi Gerganov 1969ee4bc7 Update README.md 2022-10-18 22:20:35 +03:00
Georgi Gerganov 0e4fd43400 stream : print warning when processing is not fast enough 2022-10-18 20:15:06 +03:00
Georgi Gerganov 19817711b4 Add reference to FP16 repo 2022-10-18 19:48:34 +03:00
Georgi Gerganov 7eeef0358a ref #52 : improve greedy sampling strategy
Force timestamp token to be sampled if the probability sum over all
timestamp tokens is above the probability of any other token
2022-10-18 19:48:15 +03:00
Georgi Gerganov 632660abb9 CMake support for Accelerate framework 2022-10-18 18:51:59 +03:00
Georgi Gerganov e36aabe00d Correct implementation of FP16 GELU
Can toggle it via the GGML_GELU_FP16 macro
2022-10-18 18:42:08 +03:00
Georgi Gerganov 2d171ced32 close #32 : add comment about thread-safety of the C-style API 2022-10-18 18:27:57 +03:00
Georgi Gerganov e30cf83158 ref #57, #62, #63 : remove unions in C-api + remove designated initializers
We are not ready for designated initializers - many compilers do not
support this C++ feature yet, so removing it's non-trivial usages.
2022-10-18 18:17:24 +03:00
Georgi Gerganov d6b84b2a23 ref #62 : fix build for some compilers
For some reason, new version of GCC panic when the struct type is not
specified explicitly
2022-10-18 10:57:03 +03:00
Georgi Gerganov b4a3875b2c Revert recent sampling change
It does not actually help and seems to produce worse results on some of
the samples
2022-10-18 08:26:16 +03:00
Georgi Gerganov cf67bfffa0 Fix EOT token handling
If it is the end of the audio, pick all sampled tokens.
Otherwise, print error message.
2022-10-18 00:53:06 +03:00
Georgi Gerganov 91632eb6ea Revert GELU change
Seems it does not work on x86 for some reason
2022-10-18 00:45:08 +03:00
Georgi Gerganov b81a81d543 Link Accelerate framework to "stream" example 2022-10-18 00:12:51 +03:00
Georgi Gerganov d14823582d Try to improve the sampling strategy a bit
It sill fails sometimes when it does not sample a timestamp token for
the entire segment. We now print a message in such cases
2022-10-18 00:12:51 +03:00
Georgi Gerganov 20d8e7a309 Fix memory sizes 2022-10-18 00:12:51 +03:00