llama.cpp

Author	SHA1	Message	Date
Ronsor	956dfda8ad	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142 ) There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.	2023-03-15 21:37:50 +02:00
Val Kharitonov	2a20f48efa	Fix UTF-8 handling (including colors) (#79 )	2023-03-13 18:24:18 +02:00
Georgi Gerganov	7c9e54e55e	Revert "weights_only" arg - this causing more trouble than help	2023-03-12 20:59:01 +02:00
Oleksandr Nikitin	b9bd1d0141	python/pytorch compat notes (#44 )	2023-03-12 14:16:33 +02:00
deepdiffuser	a93120236f	use weights_only in conversion script (#32 ) this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries	2023-03-12 08:36:35 +02:00
Georgi Gerganov	007a8f6f45	Support all LLaMA models + change Q4_0 quantization storage	2023-03-11 11:28:30 +02:00
Georgi Gerganov	70bc0b8b15	Fix a bug in the rope calculation	2023-03-10 23:46:57 +02:00
Georgi Gerganov	26c0846629	Initial release	2023-03-10 20:56:40 +02:00