Commit Graph

  • 4f88940ff6
    Add q3_s and q1_s (llama/5886) Abhilash Majumder 2024-03-11 10:27:56 +0530
  • 7bdb1de9ec
    metal : move mm_id indices to shared mem (llama/5982) Georgi Gerganov 2024-03-10 23:12:48 +0200
  • 653d2e8ff9
    ggml : fix unnecessary f32 -> f16 -> f32 casts (mmla) (llama/5951) Georgi Gerganov 2024-03-09 17:36:20 +0200
  • 2fef660d0a
    ggml : remove old quantization functions (llama/5942) Georgi Gerganov 2024-03-09 15:53:59 +0200
  • 24eba5a2ff
    ggml : add ggml-common.h to deduplicate shared code (llama/5940) Georgi Gerganov 2024-03-09 12:47:57 +0200
  • 6e9d3aa32d
    llama : support Mamba Selective State Space Models (llama/5328) compilade 2024-03-08 17:31:00 -0500
  • 9ae0d18856
    extra : update sync scripts after ggml-common.h Georgi Gerganov 2024-03-15 14:00:53 +0200
  • 56102531b1
    Fix aheads_masks_init for backend != CPU Dener Stassun 2024-03-14 18:29:13 +0000
  • 9fa298f9d5
    Fix incorrect n_frames passed to dtw when near end of audio Dener Stassun 2024-03-14 15:15:51 -0300
  • 3283ad1830
    return -1 to avoid confusion zhou.weiguo 2024-03-13 09:50:09 +0800
  • 10b0304a59
    dtw: cleanup Dener Stassun 2024-03-12 09:53:59 -0300
  • 87f2620788
    Copying cross QKs from decoder backend correctly Dener Stassun 2024-03-07 14:55:57 -0300
  • eb531c7d32
    Reimpl aheads n_top_most and custom. Sanity checks on chosen aheads Dener Stassun 2024-03-06 11:31:28 -0300
  • 3016444b7c
    Calling median filter with ggml_map_custom1 Dener Stassun 2024-03-05 10:19:13 -0300
  • 9a19200e22
    decoder: save cross QKs only if requested Dener Stassun 2024-03-05 09:06:26 -0300
  • 641fb2c380
    Fixed excessive memory use when using DTW timestamps. Other minor fixes to DTW timestamping function Dener Stassun 2024-03-01 13:28:07 -0300
  • 4de1ed4b40
    Fix issues related to changes in whisper.cpp Dener Stassun 2024-01-11 10:03:40 -0300
  • dfb24a4dab
    whisper: fix typo on alignment heads enum Dener Stassun 2023-12-14 14:22:33 -0300
  • 3a5f368ca4
    implement N_TOP_MOST and CUSTOM alignment heads setting Dener Stassun 2023-12-11 09:29:09 -0300
  • 93eb345b14
    Fix mistake causing incorrect alignment of dtw timestamps Dener Stassun 2023-12-07 16:20:53 -0300
  • f11ff92533
    Fix compile and assertion errors. Attempt to DTW timestamp with single_segment=false. Dener Stassun 2023-12-07 08:40:57 -0300
  • b69f5b4de3
    WIP: producing and placing DTW timestamps on tokens Dener Stassun 2023-12-04 16:47:17 -0300
  • cd52c5ae10
    whisper.cpp: impl dtw algo Dener Stassun 2023-11-13 16:47:36 -0300
  • 4e6c281192
    Merge branch 'ggerganov:master' into feat/progress 华丽 2024-03-12 13:32:31 +0800
  • a56f435fd4
    whisper : document whisper_batch.n_seq_id (#1942) Josh Bleecher Snyder 2024-03-10 07:55:22 -0700
  • ec166499d8
    whisper : improve beam search candidate diversity (#1947) Josh Bleecher Snyder 2024-03-10 07:54:43 -0700
  • b204d7cc24
    whisper : document whisper_batch.n_seq_id Josh Bleecher Snyder 2024-03-08 09:33:19 -0800
  • f99b649a92
    whisper : improve beam search candidate diversity Josh Bleecher Snyder 2024-03-09 18:30:54 -0800
  • ccf022f970
    bindings/go : add linker flags to make metal work (#1944) Josh Bleecher Snyder 2024-03-09 08:50:44 -0800
  • 2852e1af55
    whisper : make beam candidate sort more stable (#1943) Josh Bleecher Snyder 2024-03-09 08:50:03 -0800
  • a22a8684cd
    fix typo in examples/bench/bench.cpp zhou.weiguo 2024-03-09 09:00:07 +0800
  • ce945b50c3
    ggml : try fix 32-bit arm compat (#1938) Georgi Gerganov 2024-03-08 23:45:07 +0200
  • c13b1dca84
    bindings/go : add linker flags to make metal work Josh Bleecher Snyder 2024-03-05 08:27:02 -0800
  • 7c040b440c
    whisper : make beam candidate sort more stable Josh Bleecher Snyder 2024-03-08 12:57:18 -0800
  • faba65159e
    ggml : fix cont Georgi Gerganov 2024-03-08 17:45:26 +0200
  • 2abc2d70f2
    ggml : try fix 32-bit arm compat Georgi Gerganov 2024-03-08 13:48:20 +0200
  • 2f5a5a66dd
    talk-llama : use llama_decode instead of llama_eval Georgi Gerganov 2024-03-08 12:04:43 +0200
  • 8e409d1113
    talk-llama : sync llama.cpp Georgi Gerganov 2024-03-08 11:55:50 +0200
  • 05d1b61af4
    talk-llama : sync llama.cpp Georgi Gerganov 2024-03-08 11:52:47 +0200
  • 647cae178a
    sync : ggml Georgi Gerganov 2024-03-08 11:39:34 +0200
  • bae7c23fbf
    Revert "[SYCL] fix error when set main gpu to non-zero (llama/5901)" (llama/5918) Neo Zhang Jianyu 2024-03-07 19:14:49 +0800
  • 18ea187d42
    fix error when set main gpu to non-zero (llama/5901) Neo Zhang Jianyu 2024-03-07 16:34:31 +0800
  • 1daeffca54
    ggml : use SYS_get_cpu if SYS_getcpu is not defined (llama/5906) Jared Van Bortel 2024-03-06 15:42:23 -0500
  • 2f6f1d4465
    ggml : use `uint8x16_t` return type for `ggml_vqtbl1q_u8` (llama/5894) bobqianic 2024-03-06 07:35:07 +0000
  • 7ff1894c34
    add wait() to make code stable (llama/5895) Neo Zhang Jianyu 2024-03-06 12:08:32 +0800
  • 8edfc54c2b
    quants : use MM256_SET_M128I consistently to fix gcc 7 build (llama/5889) Jared Van Bortel 2024-03-05 11:56:37 -0500
  • 9c399689ec
    Vulkan Improvements (llama/5835) 0cc4m 2024-03-05 13:33:42 +0100
  • 9d9a405cfd
    fix mul_mat fault in CI/unit-test (llama/5862) Neo Zhang Jianyu 2024-03-05 16:08:35 +0800
  • edd8b38a75
    ggml : fix unknown status (llama/0) Georgi Gerganov 2024-03-04 20:53:27 +0200
  • ed76818700
    whisper : fix compute helper return (ggml/750) Georgi Gerganov 2024-03-05 16:05:23 +0200
  • 9a0b59d990
    ggml : introduce ggml_status (ggml/750) Michael Podvitskiy 2024-03-04 10:05:42 +0100
  • 93a84a143b
    cuda : fix data race in soft max (llama/5853) slaren 2024-03-03 14:26:18 +0100
  • bd26876267
    ggml : fix IQ3_S AVX implementation (llama/5834) Georgi Gerganov 2024-03-02 20:00:49 +0200
  • 21d295180d
    ggml : IQ3_S improvements (llama/5829) Kawrakow 2024-03-02 17:00:51 +0200
  • c3bfc9bfda
    Support multiple GPUs (split mode) on SYCL backend (llama/5806) Neo Zhang Jianyu 2024-03-02 19:49:30 +0800
  • 422a6b16fc
    ggml-vulkan: fix VULKAN_CHECK_RESULTS flag, which was previously broken (llama/5813) ddpasa 2024-03-01 18:00:00 +0100
  • 11dd0d4482
    Use batched mul_mat pathway (llama/5591) AidanBeltonS 2024-03-01 07:36:47 +0000
  • 26dd2f06ac
    make portability_enumeration_ext apple only (llama/5757) Eve 2024-02-28 19:33:37 +0000
  • 8cee7c08b6
    add some new ops, fix some operators and add batch operations to certain operators. (ggml/747) leejet 2024-03-03 20:23:52 +0800
  • 0be62becec
    add build scripts of bench.cpp to generate bench tool for android-based device zhou.weiguo 2024-03-07 20:27:14 +0800
  • a99e5dab66
    add build scripts of bench.cpp to generate bench tool for android-based device zhou.weiguo 2024-03-07 20:08:53 +0800
  • 2e2626b167
    examples : Auto lowercase language parameter in main.cpp (#1928) F1L1P 2024-03-06 23:25:10 +0100
  • c0c0ae2dea
    examples : fix typo in bench.cpp (#1933) zhouwg 2024-03-07 06:21:44 +0800
  • b40606b996
    bench:fix typo zhou.weiguo 2024-03-06 23:47:52 +0800
  • f03270a5e2
    add build scripts for bench.cpp zhou.weiguo 2024-03-06 20:30:13 +0800
  • 4b24c7d96e
    add build scripts for bench.cpp zhou.weiguo 2024-03-06 19:16:56 +0800
  • 6ebff08221
    add build scripts for bench.cpp zhou.weiguo 2024-03-06 18:17:26 +0800
  • a712a93d1c
    add build scripts for bench.cpp zhou.weiguo 2024-03-06 18:08:17 +0800
  • 67c0a9fca1
    add build scripts for bench.cpp zhou.weiguo 2024-03-06 18:02:11 +0800
  • 8aa2e6a226
    Update examples/main/main.cpp F1L1P 2024-03-05 21:07:17 +0100
  • 06e73da378
    refactor: typescript zcf0508 2024-03-05 23:12:46 +0800
  • 897412b5b6
    whisper : fix typo (#1925) zhouwg 2024-03-05 23:06:31 +0800
  • f22d27a385
    whisper.android.java : fix returns in JNI (#1929) zhouwg 2024-03-05 21:59:26 +0800
  • 98d895afa7
    fix SF in JNI zhou.weiguo 2024-03-05 21:05:59 +0800
  • 380c2bebfb
    Auto lowercase language parameter F1L1P 2024-03-05 11:46:07 +0100
  • 076827069c
    refine original android demo zhou.weiguo 2024-03-05 13:26:44 +0800
  • a9db18b329
    fix typo in whisper.cpp zhou.weiguo 2024-03-05 13:20:17 +0800
  • 8472186df2
    fix: avoid test fail zcf0508 2024-03-05 10:12:53 +0800
  • 19b8436ef1
    Merge branch 'master' into androidStreaming liam-mceneaney 2024-03-04 20:03:55 -0500
  • 5bc5434f71
    Android realtime whisper transcription attempt. liam-mceneaney 2024-03-04 19:36:42 -0500
  • ccd7c1d2da
    cmake : add library versioning (#1352) kennethge 2024-03-04 14:17:48 -0500
  • 2cc6a5c83c
    Merge branch 'master' into add-versioning Georgi Gerganov 2024-03-04 21:17:21 +0200
  • c713eb5e2a
    readme : recommend MacOS Sonoma for Core ML (#1917) Gavin Cai 2024-03-04 11:16:13 -0800
  • e7f9b70e56
    Update README to Recommend MacOS Sonoma for Core ML to avoid hallucination Dongcheng Cai 2024-03-03 16:52:37 -0800
  • 5ecffc9d8e
    feat: node addon support muti input files and callback function support provide progress zcf0508 2024-03-01 17:54:18 +0800
  • 25d313b38b
    talk-llama : sync llama.cpp Georgi Gerganov 2024-02-28 13:04:05 +0200
  • 3168dbf23b
    sync : ggml Georgi Gerganov 2024-02-28 13:01:33 +0200
  • 1711bb3881
    sync : llama.cpp (ggml/0) Georgi Gerganov 2024-02-28 12:59:11 +0200
  • 2533305596
    ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (llama/5760) Kawrakow 2024-02-28 10:37:02 +0200
  • 0eca512ac8
    Attempt to fix android build (llama/5752) Kawrakow 2024-02-27 19:16:49 +0200
  • 013e394a4b
    IQ4_XS: a 4.25 bpw quantization (llama/5747) Kawrakow 2024-02-27 16:34:24 +0200
  • d83f371b5f
    cuda : replace remaining shfl_xor with calls to warp_reduce functions (llama/5744) Engininja2 2024-02-27 07:22:45 -0600
  • 1c71816eab
    ggml-quants : fix avx2 iq1_s vec_dot when compiled with gcc (llama/5742) Engininja2 2024-02-27 06:50:18 -0600
  • 7b1d8ea7e0
    Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721) Kawrakow 2024-02-26 18:28:38 +0200
  • b1f7223a0a
    CUDA: fix DEBUG_CUDA_MALLOC (llama/5729) Johannes Gäßler 2024-02-26 15:36:38 +0100
  • 8408a4be8e
    Add support for soft_max ALiBi (llama/5639) AidanBeltonS 2024-02-26 14:02:11 +0000
  • 72849c24ba
    ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility (llama/5711) Radosław Gryta 2024-02-25 19:43:00 +0100
  • c19c28be71
    add google magika inference example (ggml/748) slaren 2024-02-25 20:41:35 +0100
  • 0d8fd8483a
    stream.wasm : fix invalid memory access when no segments (#1902) Andrew S 2024-02-26 02:12:35 -0600
  • 3170841ed9
    talk-llama : sync llama.cpp Georgi Gerganov 2024-02-25 20:00:10 +0200
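The run of commits from cd52c5ae10 ("whisper.cpp: impl dtw algo") through 56102531b1 adds DTW-based token-level timestamps. As background only, here is a minimal sketch of the classic dynamic time warping recurrence those commits build on: accumulate a minimum-cost monotonic path through a token-by-frame cost matrix, then backtrack to recover the alignment. This is a generic illustration, not whisper.cpp's actual implementation; the function name and cost-matrix layout are assumptions for the example.

```python
def dtw(cost):
    """Return the minimum-cost monotonic alignment through `cost`,
    a 2-D list (rows = tokens, cols = frames), as a list of (i, j) pairs.

    Illustrative sketch of classic DTW, not whisper.cpp's code."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    # acc[i][j] = minimal accumulated cost of any path reaching cell (i, j)
    acc = [[INF] * m for _ in range(n)]
    acc[0][0] = cost[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = INF
            if i > 0:                 # step down (advance token)
                best = min(best, acc[i - 1][j])
            if j > 0:                 # step right (advance frame)
                best = min(best, acc[i][j - 1])
            if i > 0 and j > 0:       # diagonal step (advance both)
                best = min(best, acc[i - 1][j - 1])
            acc[i][j] = cost[i][j] + best
    # Backtrack from the bottom-right corner to recover the path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while i > 0 or j > 0:
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)     # follow the cheapest predecessor
        path.append((i, j))
    return list(reversed(path))
```

In the timestamping use case sketched above, each path cell (i, j) pairs token i with audio frame j, so a token's timestamp can be read off from the first frame its row is matched to.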