models : fix typo in convert-h5-to-ggml.py

signficant -> significant
2022-12-31 02:51:08 +09:00 · 3467230a77
parent a091581eb3
commit 3467230a77
1 changed file with 1 addition and 1 deletion
--- a/models/convert-h5-to-ggml.py
+++ b/models/convert-h5-to-ggml.py
@@ -56,7 +56,7 @@ def bytes_to_unicode():
    The reversible bpe codes work on unicode strings.
    This means you need a large # of unicode characters in your vocab if you want to avoid UNKs.
    When you're at something like a 10B token dataset you end up needing around 5K for decent coverage.
-    This is a signficant percentage of your normal, say, 32K bpe vocab.
+    This is a significant percentage of your normal, say, 32K bpe vocab.
    To avoid that, we want lookup tables between utf-8 bytes and unicode strings.
    And avoids mapping to whitespace/control characters the bpe code barfs on.
    """