# whisper.cpp HTTP server

A simple HTTP server. WAV files are passed to the inference model via HTTP requests.

https://github.com/ggerganov/whisper.cpp/assets/1991296/e983ee53-8741-4eb5-9048-afe5e4594b8f

## Usage

```
./server -h

usage: ./bin/server [options]

options:
  -h,        --help              [default] show this help message and exit
  -t N,      --threads N         [4      ] number of threads to use during computation
  -p N,      --processors N      [1      ] number of processors to use during computation
  -ot N,     --offset-t N        [0      ] time offset in milliseconds
  -on N,     --offset-n N        [0      ] segment index offset
  -d  N,     --duration N        [0      ] duration of audio to process in milliseconds
  -mc N,     --max-context N     [-1     ] maximum number of text context tokens to store
  -ml N,     --max-len N         [0      ] maximum segment length in characters
  -sow,      --split-on-word     [false  ] split on word rather than on token
  -bo N,     --best-of N         [2      ] number of best candidates to keep
  -bs N,     --beam-size N       [-1     ] beam size for beam search
  -wt N,     --word-thold N      [0.01   ] word timestamp probability threshold
  -et N,     --entropy-thold N   [2.40   ] entropy threshold for decoder fail
  -lpt N,    --logprob-thold N   [-1.00  ] log probability threshold for decoder fail
  -debug,    --debug-mode        [false  ] enable debug mode (eg. dump log_mel)
  -tr,       --translate         [false  ] translate from source language to english
  -di,       --diarize           [false  ] stereo audio diarization
  -tdrz,     --tinydiarize       [false  ] enable tinydiarize (requires a tdrz model)
  -nf,       --no-fallback       [false  ] do not use temperature fallback while decoding
  -ps,       --print-special     [false  ] print special tokens
  -pc,       --print-colors      [false  ] print colors
  -pr,       --print-realtime    [false  ] print output in realtime
  -pp,       --print-progress    [false  ] print progress
  -nt,       --no-timestamps     [false  ] do not print timestamps
  -l LANG,   --language LANG     [en     ] spoken language ('auto' for auto-detect)
  -dl,       --detect-language   [false  ] exit after automatically detecting language
             --prompt PROMPT     [       ] initial prompt
  -m FNAME,  --model FNAME       [models/ggml-base.en.bin] model path
  -oved D,   --ov-e-device DNAME [CPU    ] the OpenVINO device used for encode inference
  --host HOST,                   [127.0.0.1] Hostname/ip-adress for the server
  --port PORT,                   [8080   ] Port number for the server
  --convert,                     [false  ] Convert audio to WAV, requires ffmpeg on the server
```
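As a rough illustration of how these options combine, the sketch below assembles a launch command from the defaults shown in the help output. The variable names (`model`, `host`, `port`, `threads`) are placeholders chosen for this example, and the `ffmpeg` check is an assumption about how you might decide whether to pass `--convert`. The final launch line is left commented out so the snippet only prints the command.

```shell
#!/bin/sh
# Values below mirror the defaults listed in the help output; adjust as needed.
model="models/ggml-base.en.bin"
host="127.0.0.1"
port="8080"
threads="4"

# Build the argument list in the positional parameters (portable across POSIX shells).
set -- ./server -m "$model" -t "$threads" --host "$host" --port "$port"

# --convert requires ffmpeg on the server, so only add it when ffmpeg is on PATH.
if command -v ffmpeg >/dev/null 2>&1; then
  set -- "$@" --convert
fi

cmdline="$*"
echo "$cmdline"

# Uncomment to actually start the server:
# "$@"
```

Keeping the host at `127.0.0.1` means the server only accepts connections from the local machine.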

> [!WARNING]
> **Do not run the server example with administrative privileges, and make sure it runs in a sandboxed environment: it accepts user file uploads and can invoke ffmpeg for format conversion, both of which are risky operations. Always validate and sanitize inputs to guard against potential security threats.**

## Request examples

**/inference**
```
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@<file-path>" \
-F temperature="0.0" \
-F temperature_inc="0.2" \
-F response_format="json"
```
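If the server was started without `--convert`, the audio has to be a WAV file before it is uploaded. The sketch below wraps the request above in two small helper functions. The function names `to_wav` and `transcribe`, the environment variables `WHISPER_HOST` and `WHISPER_PORT`, and the file names in the usage comment are all hypothetical. The `ffmpeg` settings (16 kHz, mono, 16-bit PCM) follow the conversion command recommended in the main whisper.cpp README, and are assumed here to suit the server as well.

```shell
#!/bin/sh
# Convert an input audio file ($1) to a 16 kHz, mono, 16-bit PCM WAV file ($2).
to_wav() {
  ffmpeg -y -loglevel error -i "$1" -ar 16000 -ac 1 -c:a pcm_s16le "$2"
}

# POST a WAV file ($1) to /inference and print the response body.
# --fail makes curl return a non-zero exit code on HTTP errors.
transcribe() {
  curl --silent --show-error --fail \
    "http://${WHISPER_HOST:-127.0.0.1}:${WHISPER_PORT:-8080}/inference" \
    -F file="@$1" \
    -F temperature="0.0" \
    -F temperature_inc="0.2" \
    -F response_format="json"
}

# Usage (requires ffmpeg and a running server):
#   to_wav talk.mp3 talk.wav && transcribe talk.wav
```

The explicit `Content-Type` header is omitted here because `curl` sets it, including the multipart boundary, whenever `-F` is used.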

**/load**
```
curl 127.0.0.1:8080/load \
-H "Content-Type: multipart/form-data" \
-F model="<path-to-model-file>"
```