* working but ugly
* add arg flag, not working in embedding mode
* typo
* Working! Thanks to @nullhook
* make params argument instead of hardcoded boolean. remove useless time check
* start doing the instructions but not finished. This probably doesn't compile
* Embeddings extraction support
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Major refactoring - introduce C-style API
* Clean up
* Add <cassert>
* Add <iterator>
* Add <algorithm> ....
* Fix timing reporting and accumulation
* Measure eval time only for single-token calls
* Change llama_tokenize return meaning