llama.cpp/examples/quantize
Georgi Gerganov 574406dc7e
ggml : add Q5_0 and Q5_1 quantization (#1187)
* ggml : add Q5_0 quantization (cuBLAS only)

* ggml : fix Q5_0 qh -> uint32_t

* ggml : fix q5_0 histogram stats

* ggml : q5_0 scalar dot product

* ggml : q5_0 ARM NEON dot

* ggml : q5_0 more efficient ARM NEON using uint64_t masks

* ggml : rename Q5_0 -> Q5_1

* ggml : adding Q5_0 mode

* quantize : add Q5_0 and Q5_1 to map

* ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195)

---------

Co-authored-by: Stephan Walter <stephan@walter.name>
2023-04-26 23:14:13 +03:00
| Name | Last commit | Date |
| --- | --- | --- |
| CMakeLists.txt | llama : fix linkage with mingw (#551) | 2023-03-28 21:23:09 +03:00 |
| quantize.cpp | ggml : add Q5_0 and Q5_1 quantization (#1187) | 2023-04-26 23:14:13 +03:00 |
| README.md | Overhaul the examples structure | 2023-03-25 20:26:40 +02:00 |

quantize

TODO