llama.cpp/examples

Latest commit: 254a7a7a5f — CUDA full GPU acceleration, KV cache in VRAM (#1827)
Johannes Gäßler, 2023-06-14 19:47:19 +02:00

* Fixed CUDA RoPE
* ggml_cuda_mul_mat_vec_p021
* ggml_cuda_scale
* ggml_cuda_diag_mask_inf
* ggml_is_permuted
* ggml_cuda_cpy
* flatten rows for ggml_cuda_op
* Added a --low-vram option
* Fixed Windows performance
* Fixed LLAMA_CUDA_DMMV_Y > 1 for WizardLM
| Name | Last commit | Date |
|------|-------------|------|
| baby-llama | baby-llama : fix operator!= (#1821) | 2023-06-13 22:37:54 +03:00 |
| benchmark | llama : add llama_init_backend() API (close #1527) | 2023-05-20 11:06:37 +03:00 |
| embedding | llama : add llama_init_backend() API (close #1527) | 2023-05-20 11:06:37 +03:00 |
| jeopardy | examples : add Jeopardy example (#1168) | 2023-04-28 19:13:33 +03:00 |
| main | CUDA full GPU acceleration, KV cache in VRAM (#1827) | 2023-06-14 19:47:19 +02:00 |
| metal | llama : Metal inference (#1642) | 2023-06-04 23:34:30 +03:00 |
| perplexity | llama : add llama_init_backend() API (close #1527) | 2023-05-20 11:06:37 +03:00 |
| quantize | Allow "quantizing" to f16 and f32 (#1787) | 2023-06-13 04:23:23 -06:00 |
| quantize-stats | ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684) | 2023-06-05 22:56:18 +03:00 |
| save-load-state | Remove unused n_parts parameter (#1509) | 2023-05-17 22:12:01 +00:00 |
| server | CUDA full GPU acceleration, KV cache in VRAM (#1827) | 2023-06-14 19:47:19 +02:00 |
| train-text-from-scratch | train : improved training-from-scratch example (#1652) | 2023-06-13 22:04:40 +03:00 |
| alpaca.sh | examples : Improve Alpaca Default Repeat Penalty: Better Match Alpaca.cpp Experience (#1107) | 2023-04-22 09:54:33 +03:00 |
| chat-13B.bat | Create chat-13B.bat (#592) | 2023-03-29 20:21:09 +03:00 |
| chat-13B.sh | examples : read chat prompts from a template file (#1196) | 2023-05-03 20:58:11 +03:00 |
| chat-persistent.sh | chat-persistent.sh : use bracket expressions in grep (#1564) | 2023-05-24 09:16:22 +03:00 |
| chat.sh | If n_predict == -1, generate forever | 2023-03-25 21:51:41 +02:00 |
| CMakeLists.txt | train : improved training-from-scratch example (#1652) | 2023-06-13 22:04:40 +03:00 |
| common.cpp | CUDA full GPU acceleration, KV cache in VRAM (#1827) | 2023-06-14 19:47:19 +02:00 |
| common.h | CUDA full GPU acceleration, KV cache in VRAM (#1827) | 2023-06-14 19:47:19 +02:00 |
| gpt4all.sh | examples : add -n to alpaca and gpt4all scripts (#706) | 2023-04-13 16:03:39 +03:00 |
| Miku.sh | examples : various prompt and example fixes (#1298) | 2023-05-03 18:26:47 +03:00 |
| reason-act.sh | add example of re-act pattern (#583) | 2023-03-29 10:10:24 -05:00 |