Commit Graph

  • fb466b3417
    ggml : sync ggml-metal.m Georgi Gerganov 2024-01-18 11:03:13 +0200
  • 1f50a7d29f
    sync : llama.cpp Georgi Gerganov 2024-01-17 21:23:33 +0200
  • 1de21b913d
    sync : ggml Georgi Gerganov 2024-01-17 21:22:38 +0200
  • 4aea058e5a
    ggml : add IQ2 to test-backend-ops + refactoring (llama/4990) Georgi Gerganov 2024-01-17 18:54:56 +0200
  • fd10234363
    imatrix : offload to GPU support (llama/4957) Georgi Gerganov 2024-01-17 18:46:30 +0200
  • 8fb5c6a409
    backend : add eval callback (llama/4935) Georgi Gerganov 2024-01-17 18:39:41 +0200
  • 2fe5fbfcc2
    metal : create autorelease pool during library build (llama/4970) Georgi Gerganov 2024-01-17 18:38:39 +0200
  • 01637e1a4c
    ggml : importance matrix support for legacy quants (llama/4969) Kawrakow 2024-01-16 19:51:26 +0200
  • 1b349eb1f9
    metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936) Alex Azarov 2024-01-16 14:33:02 +0100
  • 138eaebead
    ggml : introduce GGML_CALL function annotation (llama/4850) Justine Tunney 2024-01-16 03:16:33 -0800
  • 61b9192f27
    cuda : fix dequantize kernel names (llama/4938) Georgi Gerganov 2024-01-15 13:27:00 +0200
  • 161b51d91a
    CUDA: faster dequantize kernels for Q4_0 and Q4_1 (llama/4938) Kawrakow 2024-01-15 07:48:06 +0200
  • f904b31a7d
    Add ability to use importance matrix for all k-quants (llama/4930) Kawrakow 2024-01-14 16:21:12 +0200
  • 2676819cb5
    edit some comments bobqianic 2024-01-16 23:32:55 +0000
  • 41df3f010a
    Remove hallucination by using `token_nosp` bobqianic 2024-01-16 22:15:27 +0000
  • f6614155e4
    talk-llama : optional wake-up command and audio confirmation (#1765) Benjamin Heiniger 2024-01-16 14:52:01 +0100
  • 5ea1d91310
    Merge branch 'ggerganov:master' into fix-decoding bobqianic 2024-01-15 23:51:50 +0000
  • 271c321bc5
    Revert some changes bobqianic 2024-01-15 23:48:30 +0000
  • 80589d2bf2
    Revert some changes bobqianic 2024-01-15 23:46:43 +0000
  • b5c4d5cd46
    Add files via upload bobqianic 2024-01-15 21:47:05 +0000
  • 3818acbbcc
    Add files via upload bobqianic 2024-01-15 21:46:15 +0000
  • 4b3a21143e
    Fix ruby and go bindings bobqianic 2024-01-15 21:39:19 +0000
  • 5e2c820fd1
    Update Makefile bobqianic 2024-01-15 21:18:54 +0000
  • 96a9349f1a
    Add files via upload bobqianic 2024-01-15 20:24:20 +0000
  • 7047d32141
    Merge pull request #2 from bobqianic/patch bobqianic 2024-01-15 20:16:13 +0000
  • c8528a7c10
    Add files via upload bobqianic 2024-01-15 20:14:52 +0000
  • 9d0ebd193f
    Add files via upload bobqianic 2024-01-15 20:13:58 +0000
  • 6648641e1b
    Add files via upload bobqianic 2024-01-15 19:44:09 +0000
  • 7499e3c8ec
    Merge pull request #1 from bobqianic/bobqianic-patch-1 bobqianic 2024-01-15 19:42:08 +0000
  • dfef69ef49
    Delete server directory bobqianic 2024-01-15 19:41:44 +0000
  • c53c33b6b0
    revert change bobqianic 2024-01-15 19:41:10 +0000
  • 1226204af2
    Add files via upload bobqianic 2024-01-15 19:38:02 +0000
  • 28f10498af
    Update Makefile bobqianic 2024-01-15 18:23:47 +0000
  • f5f159c320
    server : fix building and simplify lib deps on Windows (#1772) Przemysław Pawełczyk 2024-01-15 14:48:13 +0100
  • 5036229f41
    cmake : simplify server example lib deps on Windows Przemyslaw Pawelczyk 2024-01-15 13:45:13 +0100
  • 346ea9304d
    make : fix server example building on MSYS2 environments (Windows) Przemyslaw Pawelczyk 2024-01-15 13:25:38 +0100
  • 7c15a462bc
    Merge branch 'ggerganov:master' into master Benjamin Heiniger 2024-01-14 20:42:53 +0100
  • 076d1e1d78
    talk-llama.cpp: fix Windows build Benjamin Heiniger 2024-01-14 20:41:17 +0100
  • 6ebba525f1
    talk-llama : sync llama.cpp Georgi Gerganov 2024-01-14 18:08:20 +0200
  • 8301f8874b
    Add files via upload bobqianic 2024-01-14 15:53:40 +0000
  • 71a65e7b7a
    Add files via upload bobqianic 2024-01-14 15:14:35 +0000
  • 2a5874441d
    talk-llama : llama.cpp Georgi Gerganov 2024-01-14 11:06:28 +0200
  • d08445c9ad
    sync : ggml Georgi Gerganov 2024-01-14 10:55:18 +0200
  • 4a945696cb
    metal : correctly set SIMD support flags on iOS (llama/4923) Alex Azarov 2024-01-14 09:44:39 +0100
  • dabc964d83
    2-bit quantizations (llama/4897) Kawrakow 2024-01-14 09:45:56 +0200
  • 654baf693d
    scripts : sync-ggml-am.sh add option to skip commits Georgi Gerganov 2024-01-14 10:53:19 +0200
  • 0644da442d
    talk-llama: fix small formatting issue in output Benjamin Heiniger 2024-01-14 04:09:39 +0100
  • f2c2ff9d67
    talk-llama: add optional audio confirmation before generating answer Benjamin Heiniger 2024-01-14 03:31:39 +0100
  • e93891833d
    talk-llama: add optional wake-word detection from command Benjamin Heiniger 2024-01-14 03:29:50 +0100
  • f001a3b7b6
    talk-llama : sync llama.cpp Georgi Gerganov 2024-01-14 00:13:17 +0200
  • c615f2c335
    sync : ggml Georgi Gerganov 2024-01-14 00:12:17 +0200
  • d839dd0242
    examples : adapt to metal API Georgi Gerganov 2024-01-14 00:09:26 +0200
  • 435847891c
    ggml: cache sin/cos for RoPE (llama/4908) Johannes Gäßler 2024-01-13 21:41:37 +0100
  • 182f290808
    metal : remove old API (llama/4919) Georgi Gerganov 2024-01-13 20:45:45 +0200
  • 447dfc11fc
    metal : disable log for loaded kernels (llama/4794) Georgi Gerganov 2024-01-13 18:46:37 +0200
  • 9aa9f3b84e
    gguf : fix potential infinite for-loop (llama/4600) texmex76 2024-01-13 17:06:20 +0100
  • 396ebd1e80
    metal : refactor kernel loading code (llama/4794) Georgi Gerganov 2024-01-13 18:03:45 +0200
  • 12490f4398
    CUDA: faster q8_0 -> f16 dequantization (llama/4895) Johannes Gäßler 2024-01-12 20:38:54 +0100
  • db078a9ba8
    talk-llama : add optional CLI arg to set the bot name (#1764) RhinoDevel 2024-01-13 19:51:35 +0100
  • 2d1b58f617
    Add optional commandline parameter to set the bot name. Marc 2024-01-13 19:38:17 +0100
  • a13a7da5ad
    examples : add python example for transcription (#1744) james wolf 2024-01-13 12:37:18 -0500
  • dcaba63a64
    moved python files to examples/python contractorwolf 2024-01-13 12:28:26 -0500
  • 519f8e8684
    whisper : load the model into multiple buffers of max size 1GB (#1763) Georgi Gerganov 2024-01-13 17:47:40 +0200
  • 49edad37d1
    whisper : load the model into multiple buffers of max size 1GB Georgi Gerganov 2024-01-13 15:45:38 +0200
  • 40ae0962f4
    talk-llama : sync llama.cpp Georgi Gerganov 2024-01-12 22:04:51 +0200
  • 1560288048
    sync : ggml Georgi Gerganov 2024-01-12 21:56:50 +0200
  • 1ad6fafd91
    backend_sched : fix assignments slaren 2024-01-12 20:38:34 +0100
  • 70840aed5f
    llama : ggml-backend integration (llama/4766) slaren 2024-01-12 20:07:38 +0100
  • b24d18feb9
    CUDA: fix softmax compile for old CUDA versions (llama/4862) Johannes Gäßler 2024-01-12 12:30:41 +0100
  • 3fa98f4395
    Importance Matrix calculation (llama/4861) Kawrakow 2024-01-12 06:59:57 +0100
  • d05b7ee90e
    models : make all scripts to be POSIX Compliant (#1725) Sơn Phan Trung 2024-01-12 19:11:04 +0700
  • 6dcee35129
    ggml : fix 32-bit ARM compat for IQ2_XS (#1758) Georgi Gerganov 2024-01-12 14:02:30 +0200
  • 563c7f1687
    ggml : fix fix fix Georgi Gerganov 2024-01-12 14:00:49 +0200
  • 2d55685bad
    ggml : fix fix Georgi Gerganov 2024-01-12 13:59:18 +0200
  • 4b87469aee
    ggml : fix 32-bit ARM compat Georgi Gerganov 2024-01-12 13:58:19 +0200
  • 5cb345f5e9
    go : add SetInitialPrompt method to bindings (#1753) Boris Bliznioukov 2024-01-12 14:44:50 +0300
  • fbcb52d3cd
    server : add more parameters to server api (#1754) George Hindle 2024-01-12 11:42:52 +0000
  • 6b01e3fedd
    whisper : fix segment length with params.no_timestamps == true Georgi Gerganov 2024-01-12 13:37:38 +0200
  • f7908f9bb8
    params : don't compute timestamps when not printing them (#1755) George Hindle 2024-01-12 11:24:38 +0000
  • dfc7c63124
    Merge branch 'ggerganov:master' into master Boris Bliznioukov 2024-01-12 10:16:26 +0300
  • 00b7a4be02
    talk-llama : sync llama.cpp Georgi Gerganov 2024-01-11 22:10:10 +0200
  • 04b0a768b8
    swift : remove local ggml.h reference Georgi Gerganov 2024-01-11 22:00:12 +0200
  • 87670425f2
    swift : track ggml release branch Georgi Gerganov 2024-01-11 21:57:40 +0200
  • 32e71a1861
    sync : ggml Georgi Gerganov 2024-01-11 21:54:17 +0200
  • 9c857cf280
    sync : llama.cpp Georgi Gerganov 2024-01-11 21:49:13 +0200
  • 97b12212dd
    ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856) Kawrakow 2024-01-11 20:39:39 +0100
  • 9fa34d79ec
    metal : put encoder debug group behind a define (llama/4873) Paul Tsochantaris 2024-01-11 14:31:52 +0000
  • a0a64a19dd
    metal : improve dequantize precision to match CPU (llama/4836) Georgi Gerganov 2024-01-09 19:37:08 +0200
  • bbc23611fa
    ggml : fix vld1q_s8_x4 32-bit compat (llama/4828) Georgi Gerganov 2024-01-09 10:42:06 +0200
  • e9783a1fb4
    CUDA: faster softmax via shared memory + fp16 math (llama/4742) Johannes Gäßler 2024-01-09 08:58:55 +0100
  • 9e0cc28792
    metal : fix deprecation warning (ggml/690) Georgi Gerganov 2024-01-11 09:34:59 +0200
  • 73072a7c73
    ggml : remove ggml_cpy_inplace and ggml_cont_inplace (ggml/693) Timothy Cronin 2024-01-11 02:27:48 -0500
  • a8ba1262ff
    metal : wrap each operation in debug group (ggml/690) Jack Mousseau 2024-01-10 06:19:19 -0800
  • e66a9a7806
    ggml : change GGML_MAX_NAME at compile time (ggml/682) leejet 2024-01-10 21:13:42 +0800
  • 338442d773
    Fix execlp call (ggml/689) Halalaluyafail3 2024-01-09 11:16:37 -0500
  • 10651bddf6
    SOTA 2-bit quants (llama/4773) Kawrakow 2024-01-08 16:02:32 +0100
  • 53d4d0b30d
    CUDA: fixed redundant value dequantization (llama/4809) Johannes Gäßler 2024-01-07 17:24:08 +0100
  • 2865e4710b
    ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (llama/4787) Konstantin Zhuravlyov 2024-01-07 01:52:42 -0500
  • c46a74a19d
    ggml : do not sched_yield when calling BLAS (llama/4761) Georgi Gerganov 2024-01-05 15:18:21 +0200
  • 46dc49a6a1
    ggml : include stdlib.h before intrin.h (llama/4736) Georgi Gerganov 2024-01-04 10:12:26 +0200