llama.cpp/common
slaren 1123f7fbdf
ggml-cuda : use graph allocator (#2684)
use a different function for no_alloc to avoid breaking backwards compat, fixes lora

remove 512 n_batch limit

fixed 2048 batch size

cleanup

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2023-08-22 15:25:19 +02:00
CMakeLists.txt gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
common.cpp ggml-cuda : use graph allocator (#2684) 2023-08-22 15:25:19 +02:00
common.h gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
console.cpp gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
console.h gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
grammar-parser.cpp gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
grammar-parser.h gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00