llama.cpp/examples
Latest commit: 947f64f163 by Andrew Godfrey, 2023-11-17 11:23:11 +01:00

finetune : zero the loraB initial vectors (#4082)

* finetune : zero the loraB initial vectors

  Without this, the first iteration starts out far from the base model instead of exactly on it. Zeroing loraB is what the LoRA paper recommends; loralib likewise zeroes at least one vector of each init pair (though in some cases it departs from the paper by using a different distribution for the other vector).

* tabs to spaces

* Use ggml_set_zero instead of adding a new function
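With loraB zeroed, the adapted weight W + loraB·loraA equals W exactly before the first update, so finetuning starts from the base model. Below is a minimal sketch of that invariant in plain C; the sizes and names are illustrative, not the actual finetune code (which zeroes the loraB ggml tensors via ggml_set_zero):

```c
/* Minimal sketch: why zero-initializing loraB makes the first finetune
 * iteration start exactly on the base model. Illustrative sizes/names,
 * not the actual llama.cpp finetune code. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define N_OUT 4  /* output dim of the weight (hypothetical) */
#define N_IN  4  /* input dim of the weight (hypothetical)  */
#define RANK  2  /* LoRA rank r                             */

int main(void) {
    float W[N_OUT][N_IN];  /* frozen base weight                  */
    float A[RANK][N_IN];   /* loraA: random init, as in the paper */
    float B[N_OUT][RANK];  /* loraB: zeroed by the patch          */

    /* arbitrary base weights and a random loraA */
    for (int i = 0; i < N_OUT; i++)
        for (int j = 0; j < N_IN; j++)
            W[i][j] = (float) rand() / RAND_MAX;
    for (int i = 0; i < RANK; i++)
        for (int j = 0; j < N_IN; j++)
            A[i][j] = (float) rand() / RAND_MAX - 0.5f;

    /* zero loraB: in the patch this is done with ggml_set_zero */
    for (int i = 0; i < N_OUT; i++)
        for (int j = 0; j < RANK; j++)
            B[i][j] = 0.0f;

    /* adapted weight is W + B*A; with B == 0 the update term vanishes,
     * so the model is bit-for-bit the base model at iteration 0 */
    float max_diff = 0.0f;
    for (int i = 0; i < N_OUT; i++) {
        for (int j = 0; j < N_IN; j++) {
            float delta = 0.0f;
            for (int k = 0; k < RANK; k++)
                delta += B[i][k] * A[k][j];
            float diff = fabsf((W[i][j] + delta) - W[i][j]);
            if (diff > max_diff) max_diff = diff;
        }
    }
    printf("max |(W + B*A) - W| = %g\n", max_diff);  /* prints 0 */
    return 0;
}
```

Only one of the two factors can start at zero: if both loraA and loraB were zeroed, the gradients of both would vanish and training could never leave the zero update, which is why loraA keeps its random init.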
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| baby-llama | build : enable more non-default compiler warnings (#3200) | 2023-09-28 17:41:44 -04:00 |
| batched | cuda : add batched cuBLAS GEMM for faster attention (#3749) | 2023-10-24 16:48:37 +03:00 |
| batched-bench | Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843) | 2023-10-29 11:31:40 -06:00 |
| batched.swift | speculative : add tree-based sampling example (#3624) | 2023-10-18 16:21:57 +03:00 |
| beam-search | llama : remove token functions with context args in favor of model (#3720) | 2023-10-23 22:40:03 +03:00 |
| benchmark | sync : ggml (backend v2) (#3912) | 2023-11-13 14:16:23 +02:00 |
| convert-llama2c-to-ggml | gguf : support big endian platform (#3552) | 2023-10-20 14:19:40 +03:00 |
| embedding | build : link against build info instead of compiling against it (#3879) | 2023-11-02 08:50:16 +02:00 |
| export-lora | sync : ggml (backend v2) (#3912) | 2023-11-13 14:16:23 +02:00 |
| finetune | finetune : zero the loraB initial vectors (#4082) | 2023-11-17 11:23:11 +01:00 |
| gguf | check C++ code with -Wmissing-declarations (#3184) | 2023-09-15 15:38:27 -04:00 |
| infill | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) | 2023-11-16 19:14:37 -07:00 |
| jeopardy | parallel : add option to load external prompt file (#3416) | 2023-10-06 16:16:38 +03:00 |
| llama-bench | build : link against build info instead of compiling against it (#3879) | 2023-11-02 08:50:16 +02:00 |
| llava | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) | 2023-11-16 19:14:37 -07:00 |
| main | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) | 2023-11-16 19:14:37 -07:00 |
| main-cmake-pkg | cmake : add missed dependencies (#3763) | 2023-10-24 20:48:45 +03:00 |
| metal | sync : ggml (backend v2) (#3912) | 2023-11-13 14:16:23 +02:00 |
| parallel | Fix some documentation typos/grammar mistakes (#4032) | 2023-11-11 23:04:58 -07:00 |
| perplexity | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) | 2023-11-16 19:14:37 -07:00 |
| quantize | build : link against build info instead of compiling against it (#3879) | 2023-11-02 08:50:16 +02:00 |
| quantize-stats | build : link against build info instead of compiling against it (#3879) | 2023-11-02 08:50:16 +02:00 |
| save-load-state | build : link against build info instead of compiling against it (#3879) | 2023-11-02 08:50:16 +02:00 |
| server | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) | 2023-11-16 19:14:37 -07:00 |
| simple | simple : fix batch handling (#3803) | 2023-10-27 08:37:41 -06:00 |
| speculative | speculative : change default p_accept to 0.5 + CLI args (#3919) | 2023-11-03 09:41:56 +02:00 |
| train-text-from-scratch | sync : ggml (backend v2) (#3912) | 2023-11-13 14:16:23 +02:00 |
| alpaca.sh | alpaca.sh : update model file name (#2074) | 2023-07-06 19:17:50 +03:00 |
| chat-13B.bat | Create chat-13B.bat (#592) | 2023-03-29 20:21:09 +03:00 |
| chat-13B.sh | examples : read chat prompts from a template file (#1196) | 2023-05-03 20:58:11 +03:00 |
| chat-persistent.sh | llama : fix session saving/loading (#3400) | 2023-10-03 21:04:01 +03:00 |
| chat-vicuna.sh | examples : add chat-vicuna.sh (#1854) | 2023-06-15 21:05:53 +03:00 |
| chat.sh | main : log file (#2748) | 2023-08-30 09:29:32 +03:00 |
| CMakeLists.txt | sampling : refactor init to use llama_sampling_params (#3696) | 2023-10-20 21:07:23 +03:00 |
| gpt4all.sh | examples : add -n to alpaca and gpt4all scripts (#706) | 2023-04-13 16:03:39 +03:00 |
| json-schema-to-grammar.py | chmod : make scripts executable (#2675) | 2023-08-23 17:29:09 +03:00 |
| llama.vim | vim : streaming and more (#2495) | 2023-08-08 14:44:48 +03:00 |
| llama2-13b.sh | gitignore : changes for Poetry users + chat examples (#2284) | 2023-07-21 13:53:27 +03:00 |
| llama2.sh | gitignore : changes for Poetry users + chat examples (#2284) | 2023-07-21 13:53:27 +03:00 |
| llm.vim | llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) | 2023-08-30 09:50:55 +03:00 |
| make-ggml.py | make-ggml.py : compatibility with more models and GGUF (#3290) | 2023-09-27 19:25:12 +03:00 |
| Miku.sh | MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287) | 2023-07-21 11:13:18 +03:00 |
| reason-act.sh | chmod : make scripts executable (#2675) | 2023-08-23 17:29:09 +03:00 |
| server-llama2-13B.sh | chmod : make scripts executable (#2675) | 2023-08-23 17:29:09 +03:00 |