Commit Graph

  • 462dc69f7b Code for bidirectional async GRPC streaming transcription example Shane Lenagh 2024-02-03 21:31:46 -0600
  • 3da4f72feb Makefile : add MACOSX_DEPLOYMENT_TARGET option Didzis Gosko 2024-02-03 20:34:55 +0200
  • 962a540455 ggml : embed Metal library source (ggml-metal.metal) into binary Didzis Gosko 2024-02-03 20:19:15 +0200
  • 670d9202ca ggml : introduce runtime selection of extended instruction sets for x86 architecture Didzis Gosko 2024-02-03 11:56:46 +0200
  • ad14ed0a4a Update CMakeLists.txt bobqianic 2024-02-03 00:37:03 +0000
  • a308108a6e Update CMakeLists.txt bobqianic 2024-02-03 00:36:34 +0000
  • 64d54b4c39 Update examples/stream-stdin/CMakeLists.txt bobqianic 2024-02-03 00:22:41 +0000
  • fff48c02f7 Update examples/CMakeLists.txt bobqianic 2024-02-03 00:22:37 +0000
  • e2e5177498 Merge pull request #4 from bobqianic/update bobqianic 2024-02-02 22:03:56 +0000
  • 67667476fa Add files via upload bobqianic 2024-02-02 22:02:56 +0000
  • 5ef0ea2127 Add files via upload bobqianic 2024-02-02 22:01:51 +0000
  • 850fa2fdaa Add files via upload bobqianic 2024-02-02 22:00:44 +0000
  • 1a5cf60555 Fix a bad command in the README.md Alex Young 2024-02-01 21:18:18 +0000
  • aa9370f946 Very rough cut of streaming from stdin. Alex Young 2024-02-01 20:58:56 +0000
  • 6a2674c29f Fix potential heap-buffer-overflow bobqianic 2024-02-01 20:48:25 +0000
  • 98e9c69d00 Fix audio feature seeking error bobqianic 2024-02-01 16:34:08 +0000
  • db49f1b9f6 Merge pull request #3 from bobqianic/restore_best_of bobqianic 2024-01-31 23:10:12 +0000
  • 7592693baf Update whisper.cpp bobqianic 2024-01-31 22:57:14 +0000
  • 6e3b7d462c Restore WhisperCppTest.java bobqianic 2024-01-31 22:55:25 +0000
  • 4b079bf745 Merge branch 'master' into fix-decoding bobqianic 2024-01-31 20:55:32 +0000
  • 7a74e929c8 sync : ggml (#0) Georgi Gerganov 2024-01-30 21:30:26 +0200
  • 361ecebe90 ggml : fix IQ3_XXS on Metal (llama/5219) Kawrakow 2024-01-30 19:15:28 +0200
  • 807cbc672e sync : ggml (llama/0) Georgi Gerganov 2024-01-30 16:21:57 +0200
  • 98ae5276b7 Faster AVX2 dot product for IQ2_XS (llama/5187) Kawrakow 2024-01-30 15:15:07 +0200
  • 6adb969b09 SOTA 3-bit quants (llama/5196) Kawrakow 2024-01-30 15:14:12 +0200
  • 8a7d6ff51a ggml alloc: Fix for null dereference on alloc failure (llama/5200) Paul Tsochantaris 2024-01-29 22:19:29 +0000
  • 25f650a8e8 Nomic Vulkan backend (llama/4456) Jared Van Bortel 2024-01-29 15:50:50 -0500
  • 44e517f074 ggml : add max buffer sizes to opencl and metal backends (llama/5181) slaren 2024-01-29 09:05:13 +0100
  • cb9de61659 metal : free metal objects (llama/5161) Paul Tsochantaris 2024-01-28 19:50:16 +0000
  • a2ef80d66f gguf : fix comparison (ggml/715) Georgi Gerganov 2024-01-29 21:08:18 +0200
  • baa190446a `ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686) John Balis 2024-01-29 06:37:33 -0600
  • 8f5220d81f gguf : add input validation, prevent integer overflows (ggml/709) Georgi Gerganov 2024-01-29 14:00:10 +0200
  • 8e391fcf3a ci : fix yolo URLs + fix metal capture (ggml/712) Georgi Gerganov 2024-01-29 13:29:46 +0200
  • 593657054e metal : add debug capture backend function (ggml/694) Jack Mousseau 2024-01-29 01:22:23 -0800
  • ae5c4f7340 common : fix wav buffer detection (#1819) JacobLinCool 2024-01-31 01:35:08 +0800
  • 4bb14d18f7 common: fix wav buffer detection JacobLinCool 2024-01-30 23:32:51 +0800
  • baa30bacdb server : add fields to `verbose_json` response (#1802) JacobLinCool 2024-01-30 20:15:55 +0800
  • 3e6fad07aa make : update MSYS_NT (#1813) jwijffels 2024-01-30 13:13:49 +0100
  • 272506f68d server: add simple demo form to the homepage JacobLinCool 2024-01-30 04:36:41 +0800
  • 71617626a5 server: todo note for compression_ratio and no_speech_prob JacobLinCool 2024-01-30 04:14:25 +0800
  • 90c22faeec Use gradle properties Neuman Vong 2024-01-29 15:16:21 +1100
  • a2867fa06b Specify GGML build options in build.gradle Neuman Vong 2024-01-28 23:08:57 +1100
  • f0d83917df Documentation and make optional Neuman Vong 2024-01-25 17:28:21 +1100
  • 41d1ca658f OpenCL Neuman Vong 2024-01-25 16:21:59 +1100
  • 054697e922 FetchContent Neuman Vong 2024-01-25 14:54:05 +1100
  • e72e4158de talk-llama : sync llama.cpp Georgi Gerganov 2024-01-28 19:44:10 +0200
  • bd41733db2 sync : ggml Georgi Gerganov 2024-01-28 19:30:32 +0200
  • 23c648e98d ggml : add Vulkan backend (llama/2059) 0cc4m 2024-01-28 18:03:59 +0100
  • 75ab2d06f5 ggml : add unified SYCL backend for Intel GPUs (llama/2690) Abhilash Majumder 2024-01-28 21:26:23 +0530
  • adc099edee ggml : minor type fix (int64_t -> size_t) Georgi Gerganov 2024-01-28 18:44:58 +0200
  • 0f976456a5 Update Makefile jwijffels 2024-01-27 19:52:46 +0100
  • 52cce82493 common : fix input buffer check (#1812) Georgi Gerganov 2024-01-27 17:33:09 +0200
  • f60a61e272 common : fix input buffer check Georgi Gerganov 2024-01-27 17:31:10 +0200
  • ef3c9ed9eb talk-llama : sync llama.cpp Georgi Gerganov 2024-01-27 17:24:53 +0200
  • 7fe3ed5e00 sync : ggml Georgi Gerganov 2024-01-27 17:23:25 +0200
  • 6061241292 Add OpenCL add kernel (llama/5151) 0cc4m 2024-01-26 23:07:32 +0100
  • 0878ab7c15 cuda : fix tensor size calculation for non-split buffer (llama/5145) slaren 2024-01-26 18:59:43 +0100
  • c65edd5b64 ggml-alloc : add 10% margin to the buffer sizes (llama/5149) slaren 2024-01-26 18:18:26 +0100
  • 3c8d14e9c5 ggml : update softmax n_task calculation (llama/5126) snadampal 2024-01-26 11:17:59 -0600
  • c3977cb2ce metal : remove unused `n_buffers` and `buffers` (llama/5129) Paul Tsochantaris 2024-01-26 12:16:07 +0000
  • 6da1661bc2 metal : show compile log messages Georgi Gerganov 2024-01-25 11:26:17 +0200
  • cc56540661 cuda : fix 2-bit quants on amd hip (llama/5105) Engininja2 2024-01-24 16:18:15 -0600
  • 94c1ae8668 llama : pre-allocate input tensors in a separate buffer (llama/5100) slaren 2024-01-24 12:48:14 +0100
  • 55d54359e0 metal : disable support for MUL_MAT F32 x F16 Georgi Gerganov 2024-01-23 15:50:56 +0200
  • d33c2ad354 CUDA: more info when no device code (llama/5088) Johannes Gäßler 2024-01-23 13:31:56 +0100
  • 9afa7ff624 minor : clean-up some warnings and style (llama/5094) Georgi Gerganov 2024-01-23 14:12:57 +0200
  • 0649289f02 ggml : parallelize FP32 conversion when using BLAS (llama/5045) Reinforce-II 2024-01-22 21:15:08 +0800
  • aaeaa43878 llava : MobileVLM support (llama/4954) XiaotaoChen 2024-01-22 21:09:35 +0800
  • 078b8e23bf llama : run all KQV ops on the CPU with no KV offload (llama/5049) slaren 2024-01-20 16:05:49 +0100
  • 74da3e1757 cuda : fix compile error in jetson platform (llama/4975) Kylin 2024-01-20 15:01:46 +0800
  • 2d2c93a798 ggml : check ggml_add src1 type (ggml/708) Judd 2024-01-26 21:04:01 +0800
  • 4bbb60efce docs : make model options / model install methods clearer (#1806) Michael Rienstra 2024-01-26 07:39:54 -0800
  • d42cc4e40e docs: minor changes Michael Rienstra 2024-01-24 10:16:19 -0800
  • 0e361edf97 docs: `$` shell prompt, consistently Michael Rienstra 2024-01-24 10:14:17 -0800
  • 94f39c28d2 undo Prettier formatter change Michael Rienstra 2024-01-23 23:32:41 -0800
  • 06b3eb05a4 make 3 options clearer Michael Rienstra 2024-01-23 15:32:43 -0800
  • 3822505f20 Clean up code block language identifiers Michael Rienstra 2024-01-23 15:16:50 -0800
  • 61655864f9 Make models more "discoverable" Michael Rienstra 2024-01-23 14:58:51 -0800
  • ba1305605d server: show request examples on home page JacobLinCool 2024-01-24 03:46:49 +0800
  • 76898416b6 server: include additional fields in the verbose_json response as OpenAI does JacobLinCool 2024-01-24 03:38:27 +0800
  • d20287764f Merge 0bb1cd6741 into 1cf679dec4 David 2024-01-22 07:57:36 -0800
  • 1cf679dec4 cmake : make libwhisper.so position independent (#1792) trixirt 2024-01-22 05:02:35 -0800
  • 41026c1e4b cmake : temporary remove VLA check (#1795) Georgi Gerganov 2024-01-22 14:51:42 +0200
  • cda677e7dc Update unicode.h bobqianic 2024-01-20 22:08:05 +0000
  • a12d40f11d Update CMakeLists.txt bobqianic 2024-01-20 22:07:25 +0000
  • eda72d3552 Add files via upload bobqianic 2024-01-20 21:54:45 +0000
  • 938691b1e4 Update WhisperCppTest.java bobqianic 2024-01-20 21:54:09 +0000
  • a1e6f202a7 fix CI bobqianic 2024-01-20 21:31:35 +0000
  • 23f0b0b633 Update Makefile bobqianic 2024-01-20 21:10:46 +0000
  • 3e05c3d1cb Merge branch 'master' into fix-decoding bobqianic 2024-01-20 21:01:27 +0000
  • 327a3dddc4 Fix tokenizer (mostly) bobqianic 2024-01-20 20:57:31 +0000
  • 8c789cb39b libwhisper.so should be position independent Tom Rix 2024-01-20 08:38:15 -0500
  • 293b517c3e Generalize install locations Tom Rix 2024-01-20 08:00:46 -0500
  • d7a284c95a Merge ad65bad50b into d6b9be21d7 Vulcan 2024-01-19 10:37:41 -0800
  • d6b9be21d7 whisper.android : return output from benchmarks (#1785) Neuman Vong 2024-01-20 01:17:38 +1100
  • 02c4f15990 whisper.android: Return output from benchmarks Neuman Vong 2024-01-19 17:04:21 +1100
  • c0329acde8 server : implement "verbose_json" format with token details (#1781) Ryan Hitchman 2024-01-18 13:58:42 -0700
  • fc20ba1f68 server: use std::lock_guard instead of manual lock/unlock Ryan Hitchman 2024-01-18 12:26:41 -0700
  • ec39c4dacc server: don't write WAV to a temporary file if not converting Ryan Hitchman 2024-01-18 12:24:33 -0700
  • 4cab7a4e07 examples/server: implement "verbose_json" format with token details. Ryan Hitchman 2024-01-18 11:52:55 -0700