llama.cpp/gguf-py
Kerfuffle dc07dc492e
convert : various script cleanups/fixes + merges and special token handling (#2842)
* convert: Fix permute calls and method/func definitions

* Cleanups for gguf-py

* Minor types cleanups.

* Initial implementation of handling merges and special tokens

* convert: Handle special tokens and merges in vocab only mode

convert: Vocab only mode no longer requires loading model tensors

* gguf: Refactor tensor name mapping

* convert: Fix type hint for special_token_types in SpecialVocab

* Use common special vocab handling in various conversion scripts

* First pass at implementing suggested changes

* Second pass

* gguf: SpecialVocab: Fix issue with special token content not in a dict

gguf: SpecialVocab: Allow skipping handling of merges

* convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json

* convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer

* gguf: SpecialVocab: Actually set load_merges in object

* Uniform args parsing and vocab only mode for convert examples

* convert.py: Set gpt2 as tokenizer model when using BPE

* Squish last type warning in gguf.py - yay!
2023-08-30 11:25:50 +03:00
..
gguf convert : various script cleanups/fixes + merges and special token handling (#2842) 2023-08-30 11:25:50 +03:00
tests gguf : make gguf pip-installable 2023-08-25 09:26:05 +03:00
LICENSE gguf : make gguf pip-installable 2023-08-25 09:26:05 +03:00
pyproject.toml convert : various script cleanups/fixes + merges and special token handling (#2842) 2023-08-30 11:25:50 +03:00
README.md gguf : make gguf pip-installable 2023-08-25 09:26:05 +03:00

gguf

This is a Python package for writing binary files in the GGUF (GGML Universal File) format.

See convert-llama-hf-to-gguf.py as an example for its usage.

Installation

pip install gguf

Development

Maintainers who participate in development of this package are advised to install it in editable mode:

cd /path/to/llama.cpp/gguf-py

pip install --editable .

Note: This may require to upgrade your Pip installation, with a message saying that editable installation currently requires setup.py. In this case, upgrade Pip to the latest:

pip install --upgrade pip

Publishing

To publish the package, you need to have twine and build installed:

pip install build twine

Then, folow these steps to release a new version:

  1. Update the version in pyproject.toml.
  2. Build the package:
python -m build
  1. Upload the generated distribution archives:
python -m twine upload dist/*

TODO

  • Add tests
  • Include conversion scripts as command line entry points in this package.
  • Add CI workflow for releasing the package.