llama.cpp/examples
Kerfuffle 4f0154b0ba
llama : support requantizing models instead of only allowing quantization from 16/32bit (#1691)
* Add support for quantizing already quantized models

* Threaded dequantizing and f16 to f32 conversion

* Clean up thread blocks with spares calculation a bit

* Use std::runtime_error exceptions.
2023-06-10 10:59:17 +03:00
baby-llama ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360) 2023-05-13 15:56:40 +03:00
benchmark llama : add llama_init_backend() API (close #1527) 2023-05-20 11:06:37 +03:00
embedding llama : add llama_init_backend() API (close #1527) 2023-05-20 11:06:37 +03:00
jeopardy examples : add Jeopardy example (#1168) 2023-04-28 19:13:33 +03:00
main main: add the possibility to open the prompt cache read-only (#1640) 2023-06-06 22:10:17 -04:00
metal llama : Metal inference (#1642) 2023-06-04 23:34:30 +03:00
perplexity llama : add llama_init_backend() API (close #1527) 2023-05-20 11:06:37 +03:00
quantize llama : support requantizing models instead of only allowing quantization from 16/32bit (#1691) 2023-06-10 10:59:17 +03:00
quantize-stats ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684) 2023-06-05 22:56:18 +03:00
save-load-state Remove unused n_parts parameter (#1509) 2023-05-17 22:12:01 +00:00
server Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703) 2023-06-06 21:33:23 +02:00
alpaca.sh examples : Improve Alpaca Default Repeat Penalty: Better Match Alpaca.cpp Experience (#1107) 2023-04-22 09:54:33 +03:00
chat-13B.bat Create chat-13B.bat (#592) 2023-03-29 20:21:09 +03:00
chat-13B.sh examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
chat-persistent.sh chat-persistent.sh : use bracket expressions in grep (#1564) 2023-05-24 09:16:22 +03:00
chat.sh If n_predict == -1, generate forever 2023-03-25 21:51:41 +02:00
CMakeLists.txt llama : Metal inference (#1642) 2023-06-04 23:34:30 +03:00
common.cpp main: add the possibility to open the prompt cache read-only (#1640) 2023-06-06 22:10:17 -04:00
common.h main: add the possibility to open the prompt cache read-only (#1640) 2023-06-06 22:10:17 -04:00
gpt4all.sh examples : add -n to alpaca and gpt4all scripts (#706) 2023-04-13 16:03:39 +03:00
Miku.sh examples : various prompt and example fixes (#1298) 2023-05-03 18:26:47 +03:00
reason-act.sh add example of re-act pattern (#583) 2023-03-29 10:10:24 -05:00