llama.cpp/common
staviq 1a159553f9
tokenizer : special token handling (#3538)
* Rewrite special token handling from #1931

* shorten param name, add st verification by type

* use offsets instead of copy by substr

* formatting, remove copying iterator on delete

* llama : normalize code-style

* swift fix

* print pfx/sfx if verb, main: split pfx input sfx

* dont add space when using special tokens

* minor : comment + spacing

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 18:11:01 +03:00
CMakeLists.txt      common : fix mirostat state when using multiple sequences (#3543)  2023-10-11 22:35:46 +03:00
common.cpp          tokenizer : special token handling (#3538)                         2023-10-17 18:11:01 +03:00
common.h            tokenizer : special token handling (#3538)                         2023-10-17 18:11:01 +03:00
console.cpp         check C++ code with -Wmissing-declarations (#3184)                 2023-09-15 15:38:27 -04:00
console.h           gguf : new file format with flexible meta data (beta) (#2398)      2023-08-21 23:07:43 +03:00
grammar-parser.cpp  check C++ code with -Wmissing-declarations (#3184)                 2023-09-15 15:38:27 -04:00
grammar-parser.h    gguf : new file format with flexible meta data (beta) (#2398)      2023-08-21 23:07:43 +03:00
log.h               build : enable more non-default compiler warnings (#3200)          2023-09-28 17:41:44 -04:00
sampling.cpp        common : fix mirostat state when using multiple sequences (#3543)  2023-10-11 22:35:46 +03:00
sampling.h          common : fix mirostat state when using multiple sequences (#3543)  2023-10-11 22:35:46 +03:00
stb_image.h         examples: support LLaVA v1.5 (multimodal model) (#3436)            2023-10-12 18:23:18 +03:00
train.cpp           tokenizer : special token handling (#3538)                         2023-10-17 18:11:01 +03:00
train.h             train : finetune LORA (#2632)                                      2023-09-28 21:40:11 +03:00