Commit graph

17 commits

Author SHA1 Message Date
Volodymyr Vitvitskyi f305bad11e
flake : build llama.cpp on Intel with nix (#2795)
Problem
-------
`nix build` fails with missing `Accelerate.h`.

Changes
-------
- Fix build of the llama.cpp with nix for Intel: add the same SDK frameworks as
for ARM
- Add `quantize` app to the output of nix flake
- Extend nix devShell with llama-python so we can use convertScript

Testing
-------
Testing the steps with nix:
1. `nix build`
Get the model and then
2. `nix develop` and then `python convert.py models/llama-2-7b.ggmlv3.q4_0.bin`
3. `nix run llama.cpp#quantize -- open_llama_7b/ggml-model-f16.gguf ./models/ggml-model-q4_0.bin 2`
4. `nix run llama.cpp#llama -- -m models/ggml-model-q4_0.bin -p "What is nix?" -n 400 --temp 0.8 -e -t 8`

Co-authored-by: Volodymyr Vitvitskyi <volodymyrvitvitskyi@SamsungPro.local>
2023-08-26 16:25:39 +03:00
Shouzheng Liu bf83bff674
metal : matrix-matrix multiplication kernel (#2615)
* metal: matrix-matrix multiplication kernel

This commit removes MPS and uses custom matrix-matrix multiplication
kernels for all quantization types. This commit also adds grouped-query
attention to support llama2 70B.

* metal: fix performance degradation from gqa

Integers are slow on the GPU, and 64-bit divides are extremely slow.
In the context of GQA, we introduce a 64-bit divide that cannot be
optimized out by the compiler, which results in a decrease of ~8% in
inference performance. This commit fixes that issue by calculating a
part of the offset with a 32-bit divide. Naturally, this limits the
size of a single matrix to ~4GB. However, this limit should be
sufficient for the near future.

* metal: fix bugs for GQA and perplexity test.

I mixed up ne02 and nb02 in the previous commit.
2023-08-16 23:07:04 +03:00
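
The two ideas in commit bf83bff674 above (grouped-query attention, and the ~4GB limit that follows from 32-bit offset arithmetic) can be illustrated with a minimal sketch. It is not taken from the Metal kernels: the function names are hypothetical, and the head counts in the usage note (64 query heads, 8 KV heads, as in LLaMA 2 70B) are supplied here for illustration only.

```python
# Hypothetical illustration; names are NOT taken from ggml-metal.metal.

U32_MAX = 2**32 - 1  # largest value representable in an unsigned 32-bit integer


def kv_head_for_query_head(q_head: int, n_head: int, n_head_kv: int) -> int:
    """Grouped-query attention: consecutive query heads share one KV head."""
    if n_head % n_head_kv != 0:
        raise ValueError("n_head must be a multiple of n_head_kv")
    group_size = n_head // n_head_kv  # query heads per KV head
    return q_head // group_size


def fits_in_32_bit_offset(n_bytes: int) -> bool:
    """True if a byte offset of this size is representable in 32 bits.

    Offsets computed with 32-bit arithmetic cannot exceed 2**32 - 1 bytes,
    which is where a limit of roughly 4GB per matrix comes from.
    """
    return 0 <= n_bytes <= U32_MAX
```

With 64 query heads and 8 KV heads, `kv_head_for_query_head(63, 64, 8)` maps the last query head to KV head 7, and `fits_in_32_bit_offset(2**32)` is false because that offset no longer fits in 32 bits.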
wzy bc3ec2cdc9
flake : support nix build '.#opencl' (#2337) 2023-07-23 14:57:02 +03:00
wzy 78a3d13424
flake : remove intel mkl from flake.nix due to missing files (#2277)
NixOS's mkl package is missing some files, such as mkl-sdl.pc. See #2261
NixOS also currently lacks the Intel C compilers (icx, icpx). See https://discourse.nixos.org/t/packaging-intel-math-kernel-libraries-mkl/975
So remove mkl from flake.nix

Some minor changes:

- Change pkgs.python310 to pkgs.python3 to track the latest version
- Add pkgconfig to devShells.default
- Remove installPhase because we have `cmake --install` from #2256
2023-07-21 13:26:34 +03:00
wzy 45a1b07e9b
flake : update flake.nix (#2270)
When `isx86_32 || isx86_64`, it will use mkl, else openblas

According to
https://discourse.nixos.org/t/rpath-of-binary-contains-a-forbidden-reference-to-build/12200/3,
add -DCMAKE_SKIP_BUILD_RPATH=ON

Fix #2261, Nix doesn't provide mkl-sdl.pc.
When building with -DBUILD_SHARED_LIBS=ON and -DLLAMA_BLAS_VENDOR=Intel10_lp64,
replace mkl-sdl.pc with mkl-dynamic-lp64-iomp.pc
2023-07-19 10:01:55 +03:00
Dave Della Costa a6803cab94
flake : add runHook preInstall/postInstall to installPhase so hooks function (#2224) 2023-07-14 22:13:38 +03:00
Rowan Hart fdd1860911
flake : fix ggml-metal.metal path and run nixfmt (#1974) 2023-06-24 14:07:08 +03:00
Faez Shakil fc45a81bc6
exposed modules so that they can be invoked by nix run github:ggerganov/llama.cpp#server etc (#1863) 2023-06-17 14:13:05 +02:00
Andrei 303f5809f1
metal : fix issue with ggml-metal.metal path. Closes #1769 (#1782)
* Fix issue with ggml-metal.metal path

* Add ggml-metal.metal as a resource for llama target

* Update flake.nix metal kernel substitution
2023-06-10 17:47:34 +03:00
jacobi petrucciani 5b57a5b726
flake : update to support metal on m1/m2 (#1724) 2023-06-07 07:15:31 +03:00
Pavol Rusnak bb98e77be7
nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py (#981) 2023-04-25 23:19:57 +02:00
Pavol Rusnak a32f7acc9f
py : cleanup dependencies (#962)
after #545 we do not need torch, tqdm and requests in the dependencies
2023-04-14 15:37:11 +02:00
Pavol Rusnak c729ff730a
flake.nix: add all binaries from bin (#848) 2023-04-13 15:49:05 +02:00
lon 317fb12fbd
Add new binaries to flake.nix (#847) 2023-04-08 12:04:23 +02:00
Ben Siraphob a18c19259a Fix Nix build 2023-03-23 17:51:26 +01:00
Ben Siraphob bd4b46d6ba Nix flake: set meta.mainProgram to llama 2023-03-20 22:50:22 +01:00
Niklas Korz a292747893
Nix flake (#40)
* Nix flake

* Nix: only add Accelerate framework on macOS

* Nix: development shell, direnv and compatibility

* Nix: use python packages supplied by withPackages

* Nix: remove channel compatibility

* Nix: fix ARM neon dotproduct on macOS

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-17 23:03:48 +01:00