llama.cpp

Daniel Bevenius 23b5e12eb5 simple : update error message for KV cache check (#4324) This commit updates the error message that is printed when the KV cache is not big enough to hold all the prompt and generated tokens. Specifically it removes the reference to n_parallel and replaces it with n_len. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>		2023-12-04 18:04:21 +02:00
baby-llama	build : enable more non-default compiler warnings (#3200)	2023-09-28 17:41:44 -04:00
batched	cuda : add batched cuBLAS GEMM for faster attention (#3749)	2023-10-24 16:48:37 +03:00
batched-bench	ggml : add ggml_soft_max_ext (#4256)	2023-12-01 10:51:24 +02:00
batched.swift	swift : fix prompt tokenization logic (#4321)	2023-12-04 15:43:45 +02:00
beam-search	llama : remove token functions with `context` args in favor of `model` (#3720)	2023-10-23 22:40:03 +03:00
benchmark	sync : ggml (backend v2) (#3912)	2023-11-13 14:16:23 +02:00
convert-llama2c-to-ggml	gguf : support big endian platform (#3552)	2023-10-20 14:19:40 +03:00
embedding	build : link against build info instead of compiling against it (#3879)	2023-11-02 08:50:16 +02:00
export-lora	sync : ggml (backend v2) (#3912)	2023-11-13 14:16:23 +02:00
finetune	finetune - update readme to mention llama support only (#4148)	2023-11-20 19:30:00 +01:00
gguf	check C++ code with -Wmissing-declarations (#3184)	2023-09-15 15:38:27 -04:00
infill	main : Add ChatML functionality to main example (#4046)	2023-11-20 14:56:59 +01:00
jeopardy	parallel : add option to load external prompt file (#3416)	2023-10-06 16:16:38 +03:00
llama-bench	build : link against build info instead of compiling against it (#3879)	2023-11-02 08:50:16 +02:00
llama.swiftui	swift : fix concatenation method to avoid invalid UTF8 stringfication (#4325)	2023-12-04 18:03:49 +02:00
llava	llava : ShareGPT4V compatibility (vision encoder only loading) (#4172)	2023-11-30 23:11:14 +01:00
lookahead	examples : add readme files	2023-11-29 11:00:17 +02:00
main	main : pass LOG_TEE callback to llama.cpp log (#4033)	2023-11-30 23:56:19 +02:00
main-cmake-pkg	cmake : add missed dependencies (#3763)	2023-10-24 20:48:45 +03:00
metal	sync : ggml (backend v2) (#3912)	2023-11-13 14:16:23 +02:00
parallel	llama : KV cache view API + better KV cache management (#4170)	2023-11-23 19:07:56 +02:00
perplexity	Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)	2023-11-16 19:14:37 -07:00
quantize	build : link against build info instead of compiling against it (#3879)	2023-11-02 08:50:16 +02:00
quantize-stats	build : link against build info instead of compiling against it (#3879)	2023-11-02 08:50:16 +02:00
save-load-state	build : link against build info instead of compiling against it (#3879)	2023-11-02 08:50:16 +02:00
server	server : fix OpenAI API `stop` field to be optional (#4299)	2023-12-03 11:10:43 +02:00
simple	simple : update error message for KV cache check (#4324)	2023-12-04 18:04:21 +02:00
speculative	examples : add readme files	2023-11-29 11:00:17 +02:00
tokenize	tokenize example: Respect normal add BOS token behavior (#4126)	2023-11-18 14:48:17 -07:00
train-text-from-scratch	sync : ggml (backend v2) (#3912)	2023-11-13 14:16:23 +02:00
alpaca.sh	alpaca.sh : update model file name (#2074)	2023-07-06 19:17:50 +03:00
chat-13B.bat	Create chat-13B.bat (#592)	2023-03-29 20:21:09 +03:00
chat-13B.sh	examples : read chat prompts from a template file (#1196)	2023-05-03 20:58:11 +03:00
chat-persistent.sh	llama : fix session saving/loading (#3400)	2023-10-03 21:04:01 +03:00
chat-vicuna.sh	examples : add chat-vicuna.sh (#1854)	2023-06-15 21:05:53 +03:00
chat.sh	main : log file (#2748)	2023-08-30 09:29:32 +03:00
CMakeLists.txt	lookahead : add example for lookahead decoding (#4207)	2023-11-26 20:33:07 +02:00
gpt4all.sh	examples : add -n to alpaca and gpt4all scripts (#706)	2023-04-13 16:03:39 +03:00
json-schema-to-grammar.py	chmod : make scripts executable (#2675)	2023-08-23 17:29:09 +03:00
llama.vim	vim : streaming and more (#2495)	2023-08-08 14:44:48 +03:00
llama2-13b.sh	gitignore : changes for Poetry users + chat examples (#2284)	2023-07-21 13:53:27 +03:00
llama2.sh	gitignore : changes for Poetry users + chat examples (#2284)	2023-07-21 13:53:27 +03:00
llm.vim	llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879)	2023-08-30 09:50:55 +03:00
make-ggml.py	make-ggml.py : compatibility with more models and GGUF (#3290)	2023-09-27 19:25:12 +03:00
Miku.sh	MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287)	2023-07-21 11:13:18 +03:00
reason-act.sh	chmod : make scripts executable (#2675)	2023-08-23 17:29:09 +03:00
server-llama2-13B.sh	chmod : make scripts executable (#2675)	2023-08-23 17:29:09 +03:00