Commit Graph

340 Commits (9a0b59d990be319952a4a02b9164b3b2327cd454)

Author SHA1 Message Date
F1L1P 2e2626b167
examples : Auto lowercase language parameter in main.cpp (#1928)
* Auto lowercase language parameter

* Update examples/main/main.cpp

Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>

---------

Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2024-03-06 22:25:10 +00:00
zhouwg c0c0ae2dea
examples : fix typo in bench.cpp (#1933) 2024-03-06 22:21:44 +00:00
zhouwg f22d27a385
whisper.android.java : fix returns in JNI (#1929) 2024-03-05 15:59:26 +02:00
Georgi Gerganov 25d313b38b
talk-llama : sync llama.cpp 2024-02-28 13:04:05 +02:00
Georgi Gerganov 1711bb3881
sync : llama.cpp (ggml/0) 2024-02-28 13:00:30 +02:00
Andrew S 0d8fd8483a
stream.wasm : fix invalid memory access when no segments (#1902)
No segments may be returned when a smaller sample buffer (EG 2048 samples) is sent to the worker.
2024-02-26 10:12:35 +02:00
Georgi Gerganov 3170841ed9
talk-llama : sync llama.cpp 2024-02-25 20:00:10 +02:00
Georgi Gerganov 578e47e70c
sync : llama.cpp (ggml/0) 2024-02-25 19:58:46 +02:00
Tamotsu Takahashi f18738f247
talk, talk-llama : pass text_to_speak as a file (#1865)
* talk-llama: pass file instead of arg

it is too hard to quote text in a portable way

* talk-llama: pass heard_ok as a file

* talk-llama: let eleven-labs.py accept options

Options: -v voice, -s savefile, -p (--play)

* talk-llama: check installed commands in "speak"

Pass "-q" to eleven-labs.py to skip checking whether elevenlabs is installed

* talk-llama: pass voice_id again

in order to sync talk with talk-llama

* talk: sync with talk-llama

Passing text_to_speak as a file is safer and more portable
cf. https://stackoverflow.com/a/59036879/45375

* talk and talk-llama: get all installed voices in speak.ps1

* talk and talk-llama: get voices from api

* talk and talk-llama: add more options to eleven-labs.py

and remove DEFAULT_VOICE because it is deprecated (https://www.reddit.com/r/ElevenLabs/comments/1830abt/what_happened_to_bella/)

```
usage: eleven-labs.py [-q] [-l] [-h] [-n NAME | -v NUMBER] [-f KEY=VAL] [-s FILE | -p] [TEXTFILE]

options:
  -q, --quick           skip checking the required library

action:
  TEXTFILE              read the text file (default: stdin)
  -l, --list            show the list of voices and exit
  -h, --help            show this help and exit

voice selection:
  -n NAME, --name NAME  get a voice object by name (default: Arnold)
  -v NUMBER, --voice NUMBER
                        get a voice object by number (see --list)
  -f KEY=VAL, --filter KEY=VAL
                        filter voices by labels (default: "use case=narration")
                        this option can be used multiple times
                        filtering will be disabled if the first -f has no "=" (e.g. -f "any")

output:
  -s FILE, --save FILE  save the TTS to a file (default: audio.mp3)
  -p, --play            play the TTS with ffplay
```

* examples: add speak_with_file()

as suggested in the review

* talk and talk-llama: ignore to_speak.txt
2024-02-24 09:24:47 +02:00
Abhilash Majumder a0ddd8392c
whisper : add SYCL support (#1863)
* add changes from llama upstream

* add sycl abstraction

* add sycl build

* update cmake

* add sycl build config

* fix bug

* fix bug

* refactor build

* fix bug

* update build

* call build

* use sycl header

* add examples

* add target

* fix typecast in quant.c

* readd fp16 and readme

* fix quant typecast

* add sample

* add readme

* remove cxx file check
2024-02-23 09:22:24 +02:00
Georgi Gerganov a2506909b1
talk-llama : sync llama.cpp 2024-02-22 23:30:53 +02:00
Georgi Gerganov 5fdb27ff80
ggml : 32-bit arm compat (#1891)
* ggml : 32-bit arm compat

* ggml : add ggml_vqtbl1q_s8 impl

* ggml : cont
2024-02-22 18:31:40 +02:00
Georgi Gerganov ce411498f6
sync : llama.cpp (ggml/0)
ggml-ci
2024-02-22 15:12:36 +02:00
Davidson Francis c56344b509
main : fix file existence check in main.cpp (#1889)
In commit dda4b0e of PR #1872, I've introduced a check for the
existence of files before loading the model. However, I haven't
considered the case where whisper.cpp might read from stdin as well,
and in such cases, the checks should ignore the "-" argument as it
does not represent a regular file.

Additionally, this commit removes the usage of 'stat()' in favor of
the recently introduced function 'is_file_exist()' in common.cpp from
PR #1871.

Apologies for the bug introduced in the previous PR and any
inconvenience it may have caused.
2024-02-22 15:01:08 +02:00
Georgi Gerganov 59119f4f20
talk-llama : sync llama.cpp 2024-02-20 12:09:57 +02:00
Georgi Gerganov 83afebe872
common : add IQ1_S (ggml/0)
ggml-ci
2024-02-19 15:53:25 +02:00
Davidson Francis dda4b0ed06
main : check if input files exist before proceeding (#1872)
Until the most recent commit (3d42463), the main.cpp sample file does
not check whether the input files exist or not. Consequently, the
model is loaded first before reporting whether there was a failure or
not when processing a file. In environments with HDD, this can take
about 50 seconds or more, depending on the loaded model.

This commit addresses this issue by checking in advance whether the
input files exist or not.
2024-02-19 10:51:26 +02:00
Felix 07d04280be
examples : clean up common code (#1871)
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
Georgi Gerganov 551529290d
talk-llama : sync llama.cpp 2024-02-12 10:39:58 +02:00
dscripka a6fb6ab597
examples : added audio_ctx argument to main and server (#1857)
* added audio_ctx argument to main and server examples

* Better default value

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* better default value (again)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov f273e66dc6
examples : initialize context params properly (#1852) 2024-02-11 16:39:12 +02:00
Georgi Gerganov 02b4c52c12
talk-llama : sync llama.cpp 2024-02-10 10:10:59 +02:00
Valentin Gosu 80e8a2ea39
server : allow CORS request with authorization headers (#1850)
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
2024-02-09 17:42:41 +02:00
Neuman Vong 19f8048139
whisper.android : how to build with CLBlast (#1809)
* FetchContent

* OpenCL

* Documentation and make optional

* Specify GGML build options in build.gradle

* Use gradle properties

* @ggerganov

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* @gpokat

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-09 17:39:05 +02:00
Georgi Gerganov 434b8f3b96
talk-llama : stream response (#1121) 2024-02-06 19:56:12 +02:00
Georgi Gerganov 7a74e929c8
sync : ggml (#0) 2024-01-30 21:30:26 +02:00
JacobLinCool ae5c4f7340
common : fix wav buffer detection (#1819) 2024-01-30 19:35:08 +02:00
JacobLinCool baa30bacdb
server : add fields to `verbose_json` response (#1802)
* server: include additional fields in the verbose_json response as OpenAI does

* server: show request examples on home page

* server: todo note for compression_ratio and no_speech_prob

* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Georgi Gerganov e72e4158de
talk-llama : sync llama.cpp 2024-01-28 19:44:10 +02:00
Georgi Gerganov 52cce82493
common : fix input buffer check (#1812) 2024-01-27 17:33:09 +02:00
Georgi Gerganov ef3c9ed9eb
talk-llama : sync llama.cpp 2024-01-27 17:24:53 +02:00
Michael Rienstra 4bbb60efce
docs : make model options / model install methods clearer (#1806)
* Make models more "discoverable"

* Clean up code block language identifiers

* make 3 options clearer

* undo Prettier formatter change

* docs: `$` shell prompt, consistently

* docs: minor changes
2024-01-26 17:39:54 +02:00
Neuman Vong d6b9be21d7
whisper.android : return output from benchmarks (#1785)
Benchmarks are failing because JNI expects a jstring and the benchmarks
are missing a return statement (i.e., returning null). The functions
actually build a jstring but don't return it, so this seems to have been
an oversight.

This patch returns the jstring and now the benchmarks run successfully.

Fixes #1783.
2024-01-19 16:17:38 +02:00
Ryan Hitchman c0329acde8
server : implement "verbose_json" format with token details (#1781)
* examples/server: implement "verbose_json" format with token details.

This is intended to mirror the format of openai's Python
whisper.transcribe() return values.

* server: don't write WAV to a temporary file if not converting

* server: use std::lock_guard instead of manual lock/unlock
2024-01-18 22:58:42 +02:00
Georgi Gerganov 1f50a7d29f
sync : llama.cpp 2024-01-17 21:23:33 +02:00
Benjamin Heiniger f6614155e4
talk-llama : optional wake-up command and audio confirmation (#1765)
* talk-llama: add optional wake-word detection from command

* talk-llama: add optional audio confirmation before generating answer

* talk-llama: fix small formatting issue in output

* talk-llama.cpp: fix Windows build
2024-01-16 15:52:01 +02:00
Przemysław Pawełczyk f5f159c320
server : fix building and simplify lib deps on Windows (#1772)
* make : fix server example building on MSYS2 environments (Windows)

It was not working since commit eff3570f78
when server was introduced.

* cmake : simplify server example lib deps on Windows

server uses httplib::Server, not httplib::SSLServer, so there is no need
to mention cryptographic libraries in target_link_libraries.
Winsock (ws2_32) suffices here.

Also use plain library names like we use in other places.
2024-01-15 15:48:13 +02:00
Georgi Gerganov 6ebba525f1
talk-llama : sync llama.cpp 2024-01-14 18:08:20 +02:00
Georgi Gerganov 2a5874441d
talk-llama : llama.cpp 2024-01-14 11:06:28 +02:00
Georgi Gerganov d08445c9ad
sync : ggml 2024-01-14 10:55:18 +02:00
Georgi Gerganov f001a3b7b6
talk-llama : sync llama.cpp 2024-01-14 00:13:17 +02:00
RhinoDevel db078a9ba8
talk-llama : add optional CLI arg to set the bot name (#1764) 2024-01-13 20:51:35 +02:00
james wolf a13a7da5ad
examples : add python example for transcription (#1744)
* rebase and add simple python interface

* moved python files to examples/python
2024-01-13 19:37:18 +02:00
Georgi Gerganov 40ae0962f4
talk-llama : sync llama.cpp 2024-01-12 22:04:51 +02:00
George Hindle fbcb52d3cd
server : add more parameters to server api (#1754)
* feat(server): add more parameters to server api

* fix(server): reset params to original parsed values for each request
2024-01-12 13:42:52 +02:00
George Hindle f7908f9bb8
params : don't compute timestamps when not printing them (#1755) 2024-01-12 13:24:38 +02:00
Georgi Gerganov 00b7a4be02
talk-llama : sync llama.cpp 2024-01-11 22:10:10 +02:00
Georgi Gerganov 32e71a1861
sync : ggml 2024-01-11 21:54:17 +02:00
Georgi Gerganov 9c857cf280
sync : llama.cpp 2024-01-11 21:50:01 +02:00
RhinoDevel bcc1658cd0
talk-llama : add optional Piper TTS support (#1749)
Add commented-out command to optionally use Piper (https://github.com/rhasspy/piper) as text-to-speech solution for the talk-llama example. Piper voices sound almost like real people which is a big improvement (e.g.) from something like espeak.
2024-01-10 16:15:28 +02:00