329 Commits (e5d06cfc0f51a2f0845b235d21fdf9e814f19685)

Author SHA1 Message Date
Georgi Gerganov 5fdb27ff80
ggml : 32-bit arm compat (#1891)
* ggml : 32-bit arm compat

* ggml : add ggml_vqtbl1q_s8 impl

* ggml : cont
2024-02-22 18:31:40 +02:00
Georgi Gerganov ce411498f6
sync : llama.cpp (ggml/0)
ggml-ci
2024-02-22 15:12:36 +02:00
Davidson Francis c56344b509
main : fix file existence check in main.cpp (#1889)
In commit dda4b0e of PR #1872, I introduced a check for the
existence of files before loading the model. However, I did not
consider the case where whisper.cpp reads from stdin. In that
case, the check should ignore the "-" argument, as it
does not represent a regular file.

Additionally, this commit removes the usage of 'stat()' in favor of
the recently introduced function 'is_file_exist()' in common.cpp from
PR #1871.

Apologies for the bug introduced in the previous PR and any
inconvenience it may have caused.
2024-02-22 15:01:08 +02:00
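A minimal sketch of the check described in #1889 and #1872, not the actual code from main.cpp. The helper `file_exists` and the function `missing_inputs` are hypothetical names introduced for illustration; the real `is_file_exist()` in common.cpp is not shown in this log and may differ. The idea is to validate every input path before the (slow) model load, and to skip "-" because it denotes stdin rather than a regular file.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the helper mentioned in the commit message.
// It only tests whether the path can be opened for reading.
static bool file_exists(const std::string & path) {
    std::ifstream f(path);
    return f.good();
}

// Returns the input paths that fail the check.
// "-" is skipped: it means "read from stdin" and is not a regular file.
static std::vector<std::string> missing_inputs(const std::vector<std::string> & inputs) {
    std::vector<std::string> missing;
    for (const auto & path : inputs) {
        if (path == "-") {
            continue;
        }
        if (!file_exists(path)) {
            missing.push_back(path);
        }
    }
    return missing;
}
```

A caller would run `missing_inputs` on the parsed input list first and exit with an error if the result is non-empty, so that a mistyped path is reported before any model is loaded.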
Georgi Gerganov 59119f4f20
talk-llama : sync llama.cpp 2024-02-20 12:09:57 +02:00
Georgi Gerganov 83afebe872
common : add IQ1_S (ggml/0)
ggml-ci
2024-02-19 15:53:25 +02:00
Davidson Francis dda4b0ed06
main : check if input files exist before proceeding (#1872)
Until the most recent commit (3d42463), the main.cpp sample file does
not check whether the input files exist. Consequently, the model is
loaded first, and only then is a failure to process a file reported.
On systems with an HDD, loading can take 50 seconds or more,
depending on the model.

This commit addresses this issue by checking in advance whether the
input files exist or not.
2024-02-19 10:51:26 +02:00
Felix 07d04280be
examples : clean up common code (#1871)
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
Georgi Gerganov 551529290d
talk-llama : sync llama.cpp 2024-02-12 10:39:58 +02:00
dscripka a6fb6ab597
examples : added audio_ctx argument to main and server (#1857)
* added audio_ctx argument to main and server examples

* Better default value

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* better default value (again)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov f273e66dc6
examples : initialize context params properly (#1852) 2024-02-11 16:39:12 +02:00
Georgi Gerganov 02b4c52c12
talk-llama : sync llama.cpp 2024-02-10 10:10:59 +02:00
Valentin Gosu 80e8a2ea39
server : allow CORS request with authorization headers (#1850)
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
2024-02-09 17:42:41 +02:00
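The preflight handling described in #1850 can be illustrated with a framework-independent sketch. The `http_response` type and `make_preflight_response` function are hypothetical and exist only for this example; they are not the server's actual code. The 204 status and the `POST, OPTIONS` method list are also assumptions. The point is that the reply to the browser's OPTIONS request must list `authorization` in `Access-Control-Allow-Headers`, otherwise the real request is never sent.

```cpp
#include <map>
#include <string>

// Hypothetical response type used only for this sketch.
struct http_response {
    int status = 200;
    std::map<std::string, std::string> headers;
};

// Build the reply to a CORS preflight (OPTIONS) request.
http_response make_preflight_response(const std::string & allowed_origin) {
    http_response res;
    res.status = 204; // assumption: a preflight reply needs no body
    res.headers["Access-Control-Allow-Origin"]  = allowed_origin;
    res.headers["Access-Control-Allow-Methods"] = "POST, OPTIONS";
    // The header the browser announced it wants to send must be allowed here.
    res.headers["Access-Control-Allow-Headers"] = "authorization";
    return res;
}
```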
Neuman Vong 19f8048139
whisper.android : how to build with CLBlast (#1809)
* FetchContent

* OpenCL

* Documentation and make optional

* Specify GGML build options in build.gradle

* Use gradle properties

* @ggerganov

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* @gpokat

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-09 17:39:05 +02:00
Georgi Gerganov 434b8f3b96
talk-llama : stream response (#1121) 2024-02-06 19:56:12 +02:00
Georgi Gerganov 7a74e929c8
sync : ggml (#0) 2024-01-30 21:30:26 +02:00
JacobLinCool ae5c4f7340
common : fix wav buffer detection (#1819) 2024-01-30 19:35:08 +02:00
JacobLinCool baa30bacdb
server : add fields to `verbose_json` response (#1802)
* server: include additional fields in the verbose_json response as OpenAI does

* server: show request examples on home page

* server: todo note for compression_ratio and no_speech_prob

* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Georgi Gerganov e72e4158de
talk-llama : sync llama.cpp 2024-01-28 19:44:10 +02:00
Georgi Gerganov 52cce82493
common : fix input buffer check (#1812) 2024-01-27 17:33:09 +02:00
Georgi Gerganov ef3c9ed9eb
talk-llama : sync llama.cpp 2024-01-27 17:24:53 +02:00
Michael Rienstra 4bbb60efce
docs : make model options / model install methods clearer (#1806)
* Make models more "discoverable"

* Clean up code block language identifiers

* make 3 options clearer

* undo Prettier formatter change

* docs: `$` shell prompt, consistently

* docs: minor changes
2024-01-26 17:39:54 +02:00
Neuman Vong d6b9be21d7
whisper.android : return output from benchmarks (#1785)
Benchmarks are failing because JNI expects a jstring and the benchmarks
are missing a return statement (i.e., returning null). The functions
actually build a jstring but don't return it, so this seems to have been
an oversight.

This patch returns the jstring and now the benchmarks run successfully.

Fixes #1783.
2024-01-19 16:17:38 +02:00
Ryan Hitchman c0329acde8
server : implement "verbose_json" format with token details (#1781)
* examples/server: implement "verbose_json" format with token details.

This is intended to mirror the format of OpenAI's Python
whisper.transcribe() return values.

* server: don't write WAV to a temporary file if not converting

* server: use std::lock_guard instead of manual lock/unlock
2024-01-18 22:58:42 +02:00
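The last bullet of #1781 replaces manual lock/unlock with `std::lock_guard`. A small sketch of why that matters, using hypothetical names (`g_mutex`, `g_counter`, `guarded_increment`) that are not taken from the server code: the guard releases the mutex on every exit path, including an exception, whereas a manual `unlock()` placed after the throwing code would be skipped.

```cpp
#include <mutex>
#include <stdexcept>

static std::mutex g_mutex;       // hypothetical: protects g_counter
static int        g_counter = 0; // hypothetical shared state

// Increment the shared counter under the lock.
// If `fail` is true, throw while the lock is held; the lock_guard
// destructor still unlocks the mutex during stack unwinding.
int guarded_increment(bool fail) {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (fail) {
        throw std::runtime_error("request failed");
    }
    return ++g_counter;
}
```

With a manual `g_mutex.lock()` and a matching `g_mutex.unlock()` at the end of the function, the throwing path would leave the mutex locked and the next caller would block forever.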
Georgi Gerganov 1f50a7d29f
sync : llama.cpp 2024-01-17 21:23:33 +02:00
Benjamin Heiniger f6614155e4
talk-llama : optional wake-up command and audio confirmation (#1765)
* talk-llama: add optional wake-word detection from command

* talk-llama: add optional audio confirmation before generating answer

* talk-llama: fix small formatting issue in output

* talk-llama.cpp: fix Windows build
2024-01-16 15:52:01 +02:00
Przemysław Pawełczyk f5f159c320
server : fix building and simplify lib deps on Windows (#1772)
* make : fix server example building on MSYS2 environments (Windows)

It has not worked since commit eff3570f78,
when the server was introduced.

* cmake : simplify server example lib deps on Windows

server uses httplib::Server, not httplib::SSLServer, so there is no need
to mention cryptographic libraries in target_link_libraries.
Winsock (ws2_32) suffices here.

Also use plain library names like we use in other places.
2024-01-15 15:48:13 +02:00
Georgi Gerganov 6ebba525f1
talk-llama : sync llama.cpp 2024-01-14 18:08:20 +02:00
Georgi Gerganov 2a5874441d
talk-llama : llama.cpp 2024-01-14 11:06:28 +02:00
Georgi Gerganov d08445c9ad
sync : ggml 2024-01-14 10:55:18 +02:00
Georgi Gerganov f001a3b7b6
talk-llama : sync llama.cpp 2024-01-14 00:13:17 +02:00
RhinoDevel db078a9ba8
talk-llama : add optional CLI arg to set the bot name (#1764) 2024-01-13 20:51:35 +02:00
james wolf a13a7da5ad
examples : add python example for transcription (#1744)
* rebase and add simple python interface

* moved python files to examples/python
2024-01-13 19:37:18 +02:00
Georgi Gerganov 40ae0962f4
talk-llama : sync llama.cpp 2024-01-12 22:04:51 +02:00
George Hindle fbcb52d3cd
server : add more parameters to server api (#1754)
* feat(server): add more parameters to server api

* fix(server): reset params to original parsed values for each request
2024-01-12 13:42:52 +02:00
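The second bullet of #1754 resets the parameters to their originally parsed values for each request. One way to sketch this, under the assumption of a hypothetical `request_params` struct and `params_for_request` function (neither is taken from the server source): keep the parsed defaults immutable and copy them at the start of every request, so that overrides from one request cannot leak into the next.

```cpp
#include <map>
#include <string>

// Hypothetical subset of per-request settings.
struct request_params {
    std::string language    = "en";
    float       temperature = 0.0f;
};

// Start from the parsed defaults and apply only this request's overrides.
// The defaults are taken by const reference and are never modified.
request_params params_for_request(const request_params & parsed_defaults,
                                  const std::map<std::string, std::string> & fields) {
    request_params p = parsed_defaults; // fresh copy per request

    auto it = fields.find("language");
    if (it != fields.end()) {
        p.language = it->second;
    }
    it = fields.find("temperature");
    if (it != fields.end()) {
        p.temperature = std::stof(it->second); // throws on non-numeric input
    }
    return p;
}
```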
George Hindle f7908f9bb8
params : don't compute timestamps when not printing them (#1755) 2024-01-12 13:24:38 +02:00
Georgi Gerganov 00b7a4be02
talk-llama : sync llama.cpp 2024-01-11 22:10:10 +02:00
Georgi Gerganov 32e71a1861
sync : ggml 2024-01-11 21:54:17 +02:00
Georgi Gerganov 9c857cf280
sync : llama.cpp 2024-01-11 21:50:01 +02:00
RhinoDevel bcc1658cd0
talk-llama : add optional Piper TTS support (#1749)
Add a commented-out command to optionally use Piper (https://github.com/rhasspy/piper) as the text-to-speech solution for the talk-llama example. Piper voices sound almost like real people, which is a big improvement over something like espeak.
2024-01-10 16:15:28 +02:00
Emmanuel Schmidbauer c46886f599
server : add request path option (#1741) 2024-01-08 22:39:51 +00:00
Georgi Gerganov 29f78392c1
main : add cli option to disable system prints (#1740) 2024-01-08 16:41:28 +02:00
Georgi Gerganov 022756a872
server : fix server temperature + add temperature_inc (#1729)
* server : fix server temperature + add temperature_inc

* server : change dashes to underscores in parameter names
2024-01-07 13:35:14 +02:00
Georgi Gerganov 3b8c2dff57
talk-llama : sync latest llama.cpp 2024-01-06 17:22:57 +02:00
Georgi Gerganov ab0a8593c5
whisper.swiftui : add .gitignore 2024-01-04 15:00:27 +02:00
Tamotsu Takahashi d87de61ae6
ci : build with CLBlast + ggml-opencl use GGML_API (#1576)
* Build with CLBlast

* Declare GGML_API

After rebasing, examples/talk-llama failed:

"D:\a\whisper.cpp\whisper.cpp\build\ALL_BUILD.vcxproj" (build target) (1) ->
"D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj" (default target) (14) ->
(Link target) ->
  llama.obj : error LNK2019: unresolved external symbol ggml_cl_free_data referenced in function "public: __cdecl llama_model::~llama_model(void)" (??1llama_model@@QEAA@XZ) [D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj]
  llama.obj : error LNK2019: unresolved external symbol ggml_cl_transform_tensor referenced in function "public: void __cdecl llama_model_loader::load_all_data(struct ggml_context *,void (__cdecl*)(float,void *),void *,struct llama_mlock *)" (?load_all_data@llama_model_loader@@QEAAXPEAUggml_context@@P6AXMPEAX@Z1PEAUllama_mlock@@@Z) [D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj]
  D:\a\whisper.cpp\whisper.cpp\build\bin\Release\talk-llama.exe : fatal error LNK1120: 2 unresolved externals [D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj]
2023-12-29 12:23:27 +02:00
Georgi Gerganov 3a5302108d
sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677)
* sync : ggml

* sync : llama.cpp

* talk-llama : fix obsolete param

* ggml-alloc : fix ggml_tallocr_is_own

* talk.wasm : update to new ggml

* ggml : fix type punning in ggml_scale

* ggml : cuda jetson + arm quants warnings
2023-12-22 17:53:39 +02:00
bobqianic d2419030b0
examples : Revert CMakeLists.txt for talk-llama (#1669) 2023-12-21 22:48:52 +02:00
Georgi Gerganov 940de9dbe9
wchess : update README.md 2023-12-14 22:00:47 +02:00
Georgi Gerganov 375585c07c
wchess : update readme 2023-12-14 17:51:14 +02:00
fraxy-v fd99ece8e3
wchess : whisper assisted chess (#1595)
* wchess: whisper assisted chess

* wchess: fix allowed moves in check

* wchess: touchstart, touchend events

* wchess: css, disabled button

* wchess : html touches

* wchess : minor fixes and code style

* wchess : bump encoder context to 1280

* wchess : index.html

* wchess : fix CI warnings

* wchess : add array header

* wchess : build static library

* wchess : display grammar

* wchess : update UX

* wchess : add comment

* wchess : add README

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-12-14 15:58:26 +02:00