llama.cpp/examples
Latest commit: 77a73403ca by Georgi Gerganov (2023-04-18 23:54:57 +03:00)
ggml : add new Q4_2 quantization (ARM only) (#1046)

* ggml : Q4_2 ARM
* ggml : add ggml_is_quantized()
* llama : update llama_type_name() with Q4_2 entry
* ggml : speed-up q4_2
  - 4 threads: ~100ms -> ~90ms
  - 8 threads:  ~55ms -> ~50ms
* ggml : optimize q4_2 using vmlaq_n_f32 + vmulq_n_f32
benchmark        benchmark : fix result validation in benchmark-q4_0-matmult (#987)  2023-04-15 08:51:54 +03:00
embedding        examples: add missing <ctime> include for time() (#1011)            2023-04-16 10:13:00 +00:00
main             Add LoRA support (#820)                                             2023-04-17 17:28:55 +02:00
perplexity       Add LoRA support (#820)                                             2023-04-17 17:28:55 +02:00
quantize         ggml : add new Q4_2 quantization (ARM only) (#1046)                 2023-04-18 23:54:57 +03:00
quantize-stats   quantize-stats : fix bug in --type argument                         2023-04-17 17:31:06 +03:00
alpaca.sh        examples : add -n to alpaca and gpt4all scripts (#706)              2023-04-13 16:03:39 +03:00
chat-13B.bat     Create chat-13B.bat (#592)                                          2023-03-29 20:21:09 +03:00
chat-13B.sh      Move chat scripts into "./examples"                                 2023-03-25 20:37:09 +02:00
chat.sh          If n_predict == -1, generate forever                                2023-03-25 21:51:41 +02:00
CMakeLists.txt   Add quantize-stats command for testing quantization (#728)          2023-04-08 00:09:18 +02:00
common.cpp       Add LoRA support (#820)                                             2023-04-17 17:28:55 +02:00
common.h         Add LoRA support (#820)                                             2023-04-17 17:28:55 +02:00
gpt4all.sh       examples : add -n to alpaca and gpt4all scripts (#706)              2023-04-13 16:03:39 +03:00
Miku.sh          Fix whitespace, add .editorconfig, add GitHub workflow (#883)       2023-04-11 19:45:44 +00:00
reason-act.sh    add example of re-act pattern (#583)                                2023-03-29 10:10:24 -05:00