Commit Graph

83 Commits (master)

Author SHA1 Message Date
Syahmi Azhar 1512545149
whisper : add loader class to allow loading from buffer and others (#353)
* whisper : add loader to allow loading from other than file

* whisper : rename whisper_init to whisper_init_from_file

* whisper : add whisper_init_from_buffer

* android : Delete local.properties

* android : load models directly from assets

* whisper : adding <stddef.h> needed for size_t + code style

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 13:03:33 +02:00
Georgi Gerganov b3c865083e
ci : add emscripten build 2023-01-05 22:10:20 +02:00
Georgi Gerganov a0d4f8e65c
main : make whisper_print_segment_callback() more readable (close #371) 2023-01-05 21:45:05 +02:00
Georgi Gerganov 196d738974
minor : close #370 + Makefile build info print change 2023-01-05 21:35:45 +02:00
Niels Mayer a593b932e4
main : add -ocsv, aka --output-csv to output a CSV file
Adds -ocsv, aka --output-csv feature to examples/main, which outputs a CSV file containing lines formatted as follows <startTime-in-integer-milliseconds>, <endTime-in-integer-milliseconds>, "<transcript-line-including-commas>".
2022-12-29 14:04:00 +02:00
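
The CSV line layout described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from examples/main: the function name `format_csv_line` is hypothetical, and doubling embedded double quotes is an assumption borrowed from common CSV practice, since the commit message does not say how quotes inside the transcript are handled.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not taken from examples/main): builds one CSV line as
//   <start-ms>, <end-ms>, "<transcript text, commas allowed>"
std::string format_csv_line(std::int64_t start_ms, std::int64_t end_ms, const std::string & text) {
    std::string quoted;
    quoted.reserve(text.size() + 2);
    quoted.push_back('"');
    for (char c : text) {
        if (c == '"') {
            // Assumption: escape an embedded quote by doubling it.
            quoted.push_back('"');
        }
        quoted.push_back(c);
    }
    quoted.push_back('"');

    // Wrapping the transcript in quotes keeps its commas from being read as field separators.
    return std::to_string(start_ms) + ", " + std::to_string(end_ms) + ", " + quoted;
}
```

Under these assumptions, `format_csv_line(0, 1500, "Hello, world")` yields the line `0, 1500, "Hello, world"`.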
Andy Maloney dc90efd504
examples : small code cleanups (#322)
- remove unnecessary initialization of string to ""
- use empty() instead of checking size()
- use emplace_back instead of push_back
- use nullptr instead of NULL
- remove unnecessary call to .data() on string
- use character overload of find_first_of() instead of passing a string
2022-12-23 20:18:51 +02:00
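
The cleanups listed in the commit above are general C++ idioms. The sketch below shows each of them in one small function; `split_words` is a hypothetical helper written for illustration and does not appear in the repository.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on spaces and optionally reports how many
// empty tokens were skipped.
std::vector<std::string> split_words(const std::string & line,
                                     std::size_t * n_skipped = nullptr) { // nullptr instead of NULL
    std::vector<std::string> words;
    std::string current;      // default-constructed, no need to initialize with ""
    std::size_t skipped = 0;
    std::size_t pos = 0;

    while (pos <= line.size()) {
        // Character overload of find_first_of() instead of passing the string " ".
        std::size_t next = line.find_first_of(' ', pos);
        if (next == std::string::npos) {
            next = line.size();
        }
        current = line.substr(pos, next - pos);
        if (current.empty()) {            // empty() instead of size() == 0
            ++skipped;
        } else {
            // emplace_back instead of push_back; the string is passed directly,
            // without an unnecessary call to .data().
            words.emplace_back(current);
        }
        pos = next + 1;
    }

    if (n_skipped != nullptr) {
        *n_skipped = skipped;
    }
    return words;
}
```

Calling `split_words("a  b", &n)` (two spaces between the letters) returns the two words `a` and `b` and sets `n` to 1 for the empty token between the spaces.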
Georgi Gerganov 99da1e5cc8
cmake : enable and fix -Wall -Wextra -Wpedantic C++ warnings 2022-12-19 20:45:08 +02:00
Matheus de Sousa 8e3f129b4d
minor : resolves some of warnings when compiling with clang/clang++ (#294)
* Resolves some of warnings when compiling with clang/clang++

Mostly nit stuff that clang catches when compiling with -Wall -Wextra
-pedantic.

- Fix comparisons between signed and unsigned integers.
- Pass a constant reference (const&) instead of copying each time.

* minor : normalize coding style

* minor : fix warning

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-12-19 20:19:01 +02:00
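
The two kinds of fix named in the commit above (comparisons between signed and unsigned integers, and passing by const reference) typically look like the sketch below. The function `total_length` is hypothetical and only illustrates the pattern; it is not code from the repository.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical example. Taking the vector by const reference avoids copying it
// on every call; a by-value parameter would copy all the strings.
std::size_t total_length(const std::vector<std::string> & segments) {
    std::size_t total = 0;
    // The index is std::size_t, matching the unsigned type returned by size().
    // Using `int i` here is what triggers -Wsign-compare under -Wall -Wextra.
    for (std::size_t i = 0; i < segments.size(); ++i) {
        total += segments[i].size();
    }
    return total;
}
```

For example, `total_length({"ab", "cde"})` returns 5.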
Georgi Gerganov fba10a4c68 whisper : language auto-detect (#59) 2022-12-17 18:49:44 +02:00
Georgi Gerganov 32fbc8cd04
main : add option to print the progress (#276) 2022-12-16 20:20:43 +02:00
Georgi Gerganov b8065d90f5
main : add "--prompt" command line argument (#90)
This allows providing an initial prompt to be used at the start of
processing.
2022-12-16 19:43:16 +02:00
Lexevolution 6ed786957e
Add newline per segment for text output (#254) 2022-12-11 20:00:29 +02:00
Georgi Gerganov 4698dcdb52 whisper : add mechanism for aborting the whisper_full() computation 2022-11-27 20:42:45 +02:00
Georgi Gerganov 0f619b52ce
main : add stereo-channel-based diarization (#64)
Not tested - I don't have stereo dialog audio
2022-11-25 22:08:58 +02:00
Georgi Gerganov bc88eb13c6
examples : add "command" tool (#171) 2022-11-25 19:36:57 +02:00
Georgi Gerganov b8ce25dec1
refactoring : more readable code 2022-11-25 19:28:04 +02:00
Georgi Gerganov 454b91de16
main : fix dangling pointer when using stdin for input (#65) 2022-11-24 17:53:51 +02:00
Georgi Gerganov d7024cf9dc
main, stream : remove --verbose flag (#178) 2022-11-24 17:52:04 +02:00
Georgi Gerganov e5dcdabbb8
unicode : fix character replacement (thanks to @tamo) 2022-11-23 08:24:29 +02:00
Georgi Gerganov 83c742f1a7 whisper : add option to speed up the audio tempo by x2
Using a Phase Vocoder for speeding up the audio tempo by scaling down
the frequencies in the frequency domain.

This reduces the computation in the Encoder by a factor of 2.
The transcription accuracy is degraded, but for slow to normal speech -
it seems to be still very good.

I think this can find application for real-time transcription - i.e. the
"stream" example.
2022-11-13 16:25:43 +02:00
Alan 7519eabf65 Adds support for stdin wav input 2022-11-09 20:37:23 +02:00
Georgi Gerganov c30bffc8a5
ref #22 : add "duration" option
Can be used to partially process a recording
2022-11-07 20:14:52 +02:00
Georgi Gerganov ef47d77492
main : fix generated bash script 2022-11-04 18:30:38 +02:00
Georgi Gerganov d5afebd37c
whisper : token-level timestamp refactoring (#49, #120)
This turned out pretty good overall. The algorithm has been moved from
main.cpp to whisper.cpp and can be reused for all subtitles types. This
means that now you can specify the maximum length of the generated
lines. Simply provide the "-ml" argument specifying the max length in
number of characters.
2022-11-02 21:45:54 +02:00
Georgi Gerganov 6fb98370ba
main : add some comments for the word-level timestamp algorithm 2022-11-01 22:35:21 +02:00
Georgi Gerganov 0729da9a3b
main : fix some edge cases for word-level timestamps 2022-11-01 22:09:25 +02:00
Georgi Gerganov 57fb46f307 main : add option for word-level timestamps (very experimental) 2022-10-30 17:06:57 +02:00
Georgi Gerganov 2827cbbbe8 main : merge parallel example in main 2022-10-29 19:37:19 +03:00
Georgi Gerganov 0b2dc3c82c parallel : working 2022-10-29 19:37:19 +03:00
Georgi Gerganov 85d6e1e1e7 main : fix sampling time + add max_context parameter 2022-10-29 19:37:19 +03:00
Georgi Gerganov ebb01b9e33
Print system info at start of program 2022-10-27 17:22:19 +03:00
Georgi Gerganov 2400660f3f Print system info in main 2022-10-26 22:54:09 +03:00
Georgi Gerganov c6710efde2 refactoring : move main + stream in examples + other stuff 2022-10-25 20:53:48 +03:00