Merge pull request #78 from jokkebk/Specify-utf8-for-vocab.json

Add encoding parameter to vocab.json opening to fix errors
Georgi Gerganov 2022-10-23 12:23:04 +03:00 committed by GitHub
commit 3d37ad5133
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 1 addition and 1 deletion


@@ -234,7 +234,7 @@ dir_tokenizer = tokenizer.name_or_path
 # output in the same directory as the model
 fname_out = dir_out + "/ggml-model.bin"
-with open(dir_tokenizer + "/vocab.json", "r") as f:
+with open(dir_tokenizer + "/vocab.json", "r", encoding="utf8") as f:
     tokens = json.load(f)
 # use 16-bit or 32-bit floats
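
For context on why the one-line change matters: when open() is called in text mode without an encoding argument, Python falls back to the locale's preferred encoding, which on many Windows setups is not UTF-8. A vocab.json containing non-ASCII tokens can then fail to decode or decode incorrectly. The commit message does not say which errors were seen, so the sketch below only illustrates the general pattern. The helper name load_vocab, the temporary file, and the sample tokens are hypothetical and not part of the patched script.

```python
import json
import os
import tempfile


def load_vocab(path):
    """Read a JSON vocabulary file, always decoding it as UTF-8.

    Hypothetical helper; the patched script opens the file inline instead.
    """
    with open(path, "r", encoding="utf8") as f:
        return json.load(f)


# Write a small stand-in vocab.json containing non-ASCII tokens (made-up data).
sample = {"Ġhello": 0, "é": 1}
tmp_dir = tempfile.mkdtemp()
vocab_path = os.path.join(tmp_dir, "vocab.json")
with open(vocab_path, "w", encoding="utf8") as f:
    json.dump(sample, f, ensure_ascii=False)

# Reading with an explicit encoding gives the same result on every platform,
# regardless of the locale's default encoding.
tokens = load_vocab(vocab_path)
```

Passing the encoding explicitly on both the write and the read side keeps the round trip independent of the machine's locale settings.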