# llama.cpp/example/jeopardy

This is pretty much a straight port of aigoopy/llm-jeopardy/ with an added graph viewer.

The jeopardy test can be used to compare the factual knowledge of different models against each other. This is in contrast to other tests, which measure logical deduction, creativity, writing skills, etc.

Step 1: Open `jeopardy.sh` and modify the following:

```
MODEL=(path to your model)
MODEL_NAME=(name of your model)
prefix=(basically, if you use vicuna it's Human: , if you use something else it might be User: , etc)
opts=(add -instruct here if needed for your model, or anything else you want to test out)
```
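As a concrete illustration, a filled-in configuration might look like the sketch below. The model path, name, and options shown are hypothetical examples, not defaults shipped with the script:

```shell
# Hypothetical example values -- substitute your own model and flags.
MODEL=./models/vicuna-13b/ggml-model-q4_0.bin   # path to your model file
MODEL_NAME="Vicuna 13B q4_0"                    # label used in the results
prefix="Human: "                                # prompt prefix the model expects
opts="--temp 0"                                 # any extra flags to test out
```

With these set, the script is then run from the llama.cpp root as described in Step 2.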

Step 2: Run `jeopardy.sh` from the llama.cpp folder

Step 3: Repeat steps 1 and 2 until you have all the results you need.

Step 4: Run `graph.py`, and follow the instructions. At the end, it will generate your final graph.
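The scoring behind that graph can be sketched roughly as follows. The `accuracy` helper and the sample data here are hypothetical illustrations, not the actual `graph.py` code (which prompts you interactively for result files and labels and uses a plotting library for the bars):

```python
# Hypothetical sketch: score each model's answers and prepare bar-chart data.
# Each result list marks whether a question was answered correctly.

def accuracy(results):
    """Percentage of questions answered correctly."""
    return 100.0 * sum(results) / len(results)

# Illustrative data only -- real runs would load per-model answer files.
models = {
    "Human":   [True] * 48 + [False] * 52,  # 48/100 correct
    "model-a": [True] * 30 + [False] * 70,  # 30/100 correct
}

scores = {name: accuracy(r) for name, r in models.items()}

# Plotting would then be a simple bar chart, e.g. with matplotlib:
#   import matplotlib.pyplot as plt
#   plt.bar(scores.keys(), scores.values())
#   plt.ylabel("Accuracy (%)")
#   plt.show()
```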

Note: The Human bar is based on the full, original 100 sample questions. If you modify the question count or the questions themselves, it will not be valid.