llama.cpp/examples
Latest commit b8ad1b66b2 by Xiao-Yong Jin:
server : allow json array in prompt or content for direct token input (#2306)
* server: allow json array in prompt or content

In addition to the current string-valued prompt or content, we now
accept an array of strings and numbers representing tokens.

This allows direct token input: the frontend can process special
tokens and splice them in while constructing the JSON data, before
sending it to the server, so the server does not need to recognize
or parse special tokens in textual input.

With this, we can use the BOS and EOS tokens expected by
llama-2-chat models.

* server: use tokenizePrompt(json) and default "" if empty prompt

* server: fix prompt check

* server: tokenize endpoint no longer adds BOS
2023-08-23 15:12:12 +08:00
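As a rough sketch of what this change enables (assuming a llama.cpp server at http://localhost:8080 with a /completion endpoint, and the LLaMA-2 special token ids BOS = 1 and EOS = 2 — details not stated in the commit itself), a frontend can now splice a special token id directly into the prompt array instead of asking the server to parse it out of text:

```python
import json

# Sketch only, not the authoritative API. Assumptions: llama.cpp server
# running locally, and LLaMA-2 vocab ids BOS = 1, EOS = 2.
BOS, EOS = 1, 2

def build_prompt(system: str, user: str) -> list:
    """Mix raw token ids (numbers) with text segments (strings).

    Per the commit, the server tokenizes each string segment and
    splices the numeric ids in as-is, so special tokens survive
    untouched instead of being re-tokenized from text.
    """
    return [BOS, f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"]

payload = {
    "prompt": build_prompt("You are a helpful assistant.", "Hello!"),
    "n_predict": 16,
}
body = json.dumps(payload)
# POST `body` to the server with any HTTP client, e.g.:
#   curl http://localhost:8080/completion -d "$body"
```

The same mixed string/number array is accepted for message content, so chat frontends can insert model-specific control tokens without the server knowing anything about them.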
Name | Latest commit | Date
baby-llama | Add LLAMA_DEFAULT_RMS_EPS so we can change the default (#2384) | 2023-07-25 18:35:53 +03:00
benchmark | cmake : install targets (#2256) | 2023-07-19 10:01:11 +03:00
convert-llama2c-to-ggml | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
embd-input | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
embedding | embedding : evaluate prompt in batches (#2713) | 2023-08-22 16:03:12 +02:00
gguf | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
gptneox-wip | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
jeopardy | hooks : setting up flake8 and pre-commit hooks (#1681) | 2023-06-17 13:32:48 +03:00
llama-bench | llama-bench : minor fixes (#2695) | 2023-08-22 10:56:03 +03:00
main | docs : add grammar docs (#2701) | 2023-08-22 21:01:57 -04:00
metal | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
perplexity | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
quantize | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
quantize-stats | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
save-load-state | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
server | server : allow json array in prompt or content for direct token input (#2306) | 2023-08-23 15:12:12 +08:00
simple | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
train-text-from-scratch | ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709) | 2023-08-22 14:22:08 +03:00
alpaca.sh | alpaca.sh : update model file name (#2074) | 2023-07-06 19:17:50 +03:00
chat-13B.bat | Create chat-13B.bat (#592) | 2023-03-29 20:21:09 +03:00
chat-13B.sh | examples : read chat prompts from a template file (#1196) | 2023-05-03 20:58:11 +03:00
chat-persistent.sh | chat-persistent.sh : use bracket expressions in grep (#1564) | 2023-05-24 09:16:22 +03:00
chat-vicuna.sh | examples : add chat-vicuna.sh (#1854) | 2023-06-15 21:05:53 +03:00
chat.sh | If n_predict == -1, generate forever | 2023-03-25 21:51:41 +02:00
CMakeLists.txt | gguf : new file format with flexible meta data (beta) (#2398) | 2023-08-21 23:07:43 +03:00
gpt4all.sh | examples : add -n to alpaca and gpt4all scripts (#706) | 2023-04-13 16:03:39 +03:00
json-schema-to-grammar.py | examples : generate JSON according to schema (#1887) | 2023-08-02 22:05:44 -04:00
llama.vim | vim : streaming and more (#2495) | 2023-08-08 14:44:48 +03:00
llama2-13b.sh | gitignore : changes for Poetry users + chat examples (#2284) | 2023-07-21 13:53:27 +03:00
llama2.sh | gitignore : changes for Poetry users + chat examples (#2284) | 2023-07-21 13:53:27 +03:00
llm.vim | llm.vim : multiline autocompletion, get rid of "^@" (#2543) | 2023-08-08 15:07:02 +03:00
make-ggml.py | examples : add easy python script to create quantized (k-bit support) GGML models from local HF Transformer models (#2311) | 2023-07-21 22:01:10 +03:00
Miku.sh | MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287) | 2023-07-21 11:13:18 +03:00
reason-act.sh | add example of re-act pattern (#583) | 2023-03-29 10:10:24 -05:00
server-llama2-13B.sh | examples : fix whitespace | 2023-07-28 21:05:08 +03:00