whisper.cpp

mirror of https://github.com/ggml-org/whisper.cpp.git synced 2025-09-15 13:28:35 +08:00

Author	SHA1	Message	Date
Georgi Gerganov	fc45bb8625	talk-llama : sync llama.cpp ggml-ci	2025-08-18 20:30:45 +03:00
Georgi Gerganov	7fd2fbde45	common : handle mxfp4 enum ggml-ci	2025-08-18 20:30:45 +03:00
Daniel Bevenius	040510a132	node : add win platform check for require path (#3363 ) This commit adds a check of the platform in use and adjusts the path to the addon.node shared library. The motivation for this change is that on Windows the addon.node library is built into build\bin\Release and on Linux into build/Release. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3360	2025-08-15 14:54:23 +02:00
Georgi Gerganov	b02242d0ad	wasm : change ggml model host to HF (#3369 )	2025-08-10 13:00:17 +03:00
Daniel Bevenius	0becabc8d6	stream.wasm : add language selection support (#3354 ) * stream.wasm : add language selection support This commit adds support for selecting the language in the stream.wasm example. This includes adding the model `base`, which supports multilingual transcription, and allowing the user to select a language from a dropdown menu in the HTML interface. The motivation for this is that it allows users to transcribe audio in various languages. Refs: https://github.com/ggml-org/whisper.cpp/issues/3347 * squash! stream.wasm : add language selection support Remove strdup() for language in stream.wasm and update button text for base (should not be "base.en" but just "base").	2025-08-02 07:03:04 +02:00
Georgi Gerganov	d0a9d8c7f8	talk-llama : sync llama.cpp	2025-07-28 13:02:32 +03:00
Daniel Bevenius	7de8dd783f	examples : add note about WHISPER_WASM_SINGLE_FILE [no ci] (#3332 ) This commit adds a note to the README files of the WASM examples about the `WHISPER_WASM_SINGLE_FILE` option. The motivation for this is that currently this option is not documented and might be surprising to users who expect a separate .wasm file to be generated. Refs: https://github.com/ggml-org/whisper.cpp/issues/3290	2025-07-24 16:06:48 +02:00
Sacha Arbonel	1f5cf0b288	server : hide language probabilities option behind flag (#3328 ) * examples/server: hide language probabilities option behind flag * code review * fix	2025-07-21 13:03:54 +02:00
Greg Sadetsky	a16da91365	examples : update links in wasm examples (#3318 ) * fix 404 link * update link in whisper.wasm example * update example in command.wasm * update link in bench.wasm example * update link in stream.wasm example	2025-07-12 23:22:35 +02:00
Georgi Gerganov	6ddff4d96a	talk-llama : sync llama.cpp ggml-ci	2025-07-12 19:23:56 +03:00
accessiblepixel	869335f2d5	server : add dtw.params for v3-large-turbo (#3307 ) * Add DTW model large-v3-turbo parameters to server.cpp example DTW support is available in whispercpp and the large-v3-turbo model has already been added to the sources, but the large-v3-turbo model hasn't been added to the server.cpp file to make use of it. This commit hopefully corrects that issue. * match original linebreak of original server.cpp file after adding large.v3.turbo dtw	2025-07-07 12:51:15 +03:00
Lin Xiaodong	d9999d54c8	feat: support vad for addon.node (#3301 ) Co-authored-by: linxiaodong <calm.lin@wukongsch.com>	2025-07-02 13:14:29 +03:00
Georgi Gerganov	1f816de7da	talk-llama : sync llama.cpp	2025-07-01 17:54:53 +03:00
Daniel Bevenius	32cf4e2aba	whisper : add version function (#3289 ) * whisper : add version function This commit adds a version function to the whisper API. The motivation for this is that it might be convenient to have a way to programmatically check the version. Example usage: ```c++ printf("Using whisper version: %s\n", whisper_version()); ``` Will output: ```console Using whisper version: 1.7.6 ``` * examples : add version to android example CMakeLists.txt	2025-06-26 18:09:42 +02:00
Georgi Gerganov	dc8dda60ee	bench : print system info before ctx check	2025-06-25 16:01:32 +03:00
Daniel Bevenius	1ad258ca31	stream : add nullptr check of whisper_context (#3283 ) * stream : add nullptr check of whisper_context This commit adds a check to ensure that the `whisper_context` is not null after initialization. The motivation for this is that currently, if the initialization fails, the program continues to run, leading to a segmentation fault. This sort of check is performed by other examples like whisper-cli. Refs: https://github.com/ggml-org/whisper.cpp/issues/3280#issuecomment-3003778035 * examples : add nullptr check for whisper_context	2025-06-25 14:16:31 +02:00
Aaron Ang	4d6ae52ed3	command: output commands to text file (#3273 ) This commit implements code for the command line argument `-f --file FNAME` which is currently missing.	2025-06-24 06:41:21 +02:00
Georgi Gerganov	e6c10cf3d5	talk-llama : sync llama.cpp ggml-ci	2025-06-21 07:34:17 +03:00
Daniel Bevenius	3e65f518dd	android : update CMakeLists.txt to use FetchContent for ggml (#3268 ) * android : update CMakeLists.txt to use FetchContent for ggml This commit updates the CMakeLists.txt file for the Android Whisper example to use FetchContent for managing the ggml library. The motivation for this change is to avoid having to make manual changes to the CMakeLists.txt file after syncing the ggml library. I've built and run the example locally to verify that it works as expected. Refs: https://github.com/ggml-org/whisper.cpp/pull/3265#issuecomment-2986715717 * android.java : update cmake to use FetchContent for ggml This commit updates the CMake configuration for the Android Java example to use `FetchContent` for including the `ggml` library. To be able to use FetchContent we also update the `compileSdkVersion` and `targetSdkVersion` to 31, and the `buildToolsVersion` to '30.0.3'. This also required an update to the Gradle plugin version to 7.4.0. The motivation for this change is to avoid having to make manual changes to the CMakeLists.txt file after syncing the ggml library.	2025-06-19 16:06:42 +02:00
Georgi Gerganov	17bece1885	cmake : fix android build (#3265 ) * cmake : fix android build --------- Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2025-06-19 08:24:41 +02:00
Daniel Bevenius	ecb8f3c2b4	examples : add stereo to mono conversion in read_audio_data (#3266 ) This commit adds a conversion from stereo to mono in the `read_audio_data` function of `common-whisper.cpp`. The motivation for this change is that prior to commit `7d3da68f79` ("examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759)"), there was a step that read stereo int16 data -> pcm16 (448512 samples), then converted to mono (224256 samples), and then also converted to stereo in `pcmf32s`. The middle step here seems to have been missed when rewriting the code to use Miniaudio, which caused issues when transcribing stereo audio files. For example, currently using the audio sample in the linked issue the output is: ```console [00:00:00.000 --> 00:00:03.000] (speaker 1) Sous-titres réalisés para la communauté d'Amara.org ``` And with the change in this commit the output is: ``` [00:00:00.000 --> 00:00:01.500] (speaker 1) sonnerie de téléphone [00:00:01.500 --> 00:00:07.000] (speaker 1) Salut jeune homme ! [00:00:07.000 --> 00:00:08.500] (speaker 0) C'est vrai que je te dérange ? [00:00:08.500 --> 00:00:10.500] (speaker 1) Ah pas du tout, pas du tout, pas du tout ! [00:00:10.500 --> 00:00:12.500] (speaker 1) J'étais en train de... [00:00:12.500 --> 00:00:14.500] (speaker 1) de préparer un courrier ``` Resolves: https://github.com/ggml-org/whisper.cpp/issues/3092	2025-06-18 17:41:43 +02:00
Georgi Gerganov	2f60ebc3c2	talk-llama : sync llama.cpp ggml-ci	2025-06-18 12:40:34 +03:00
Daniel Bevenius	f3ff80ea8d	examples : set the C++ standard to C++17 for server (#3261 ) This commit updates the server example to use C++17 as the standard. The motivation for this change is that currently the ci-run `ggml-100-mac-m4` is failing when compiling the server example on macOS. The `talk-llama` example also has this setting so it looks like an alright change to make. ggml-ci Refs: https://github.com/ggml-org/ci/tree/results/whisper.cpp/2a/4d6db7d90899aff3d58d70996916968e4e0d27/ggml-100-mac-m4	2025-06-17 11:29:48 +02:00
w1redch4d	2a4d6db7d9	examples : update usage/help in yt-wsp.sh (#3251 ) This commit updates the usage/help message to be more readable and include the environment variables available to set options.	2025-06-16 12:21:16 +02:00
Sacha Arbonel	107c303e69	server : graceful shutdown, atomic server state, and health endpoint Improvements (#3243 ) * feat(server): implement graceful shutdown and server state management * refactor(server): use lambda capture by reference in server.cpp	2025-06-16 10:14:26 +02:00
Daniel Bevenius	0a4d85cf8a	server : add Voice Activity Detection (VAD) support (#3246 ) * server : add Voice Activity Detection (VAD) support This commit adds support for Voice Activity Detection (VAD) in the server example. The motivation for this is to enable VAD processing when using whisper-server. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3089 * server : add VAD parameters to usage in README.md [no ci] This commit also adds a few missing parameters. * server : fix conflicting short options [no ci]	2025-06-13 13:24:03 +02:00
Daniel Bevenius	9df8d54bcb	cli : fix short name conflict for vad options [no ci] (#3247 ) This commit fixes a short name conflict in whisper-cli for `--vad-min-speech-duration-ms` and `--vad-min-silence-duration-ms`, which currently have the same short name `-vsd`. Refs: https://github.com/ggml-org/whisper.cpp/pull/3246#pullrequestreview-2923800114	2025-06-13 10:25:25 +02:00
Georgi Gerganov	962361bd79	android : fix builds (#0 ) ggml-ci	2025-06-10 12:40:33 +03:00
Georgi Gerganov	db264d6220	talk-llama : sync llama.cpp ggml-ci	2025-06-10 12:40:33 +03:00
Daniel Bevenius	b505539670	node : add language detection support (#3190 ) This commit adds support for language detection in the Whisper Node.js addon example. It also updates the node addon to return an object instead of an array as the result. The motivation for this change is to enable the inclusion of the detected language in the result, in addition to the transcription segments. For example, when using the `detect_language` option, the result will now be: ```console { language: 'en' } ``` And if the `language` option is set to "auto", it will also return: ```console { language: 'en', transcription: [ [ '00:00:00.000', '00:00:07.600', ' And so my fellow Americans, ask not what your country can do for you,' ], [ '00:00:07.600', '00:00:10.600', ' ask what you can do for your country.' ] ] } ```	2025-06-02 14:58:05 +02:00
Georgi Gerganov	7fd6fa8097	talk-llama : sync llama.cpp ggml-ci	2025-06-01 15:14:44 +03:00
Daniel Bevenius	73a8c5fb94	whisper : remove whisper_load_backends function (#3196 ) * whisper : remove whisper_load_backends function This commit removes the `whisper_load_backends` function, which was used to load all GGML backends. The motivation for this change is to push the responsibility of loading backends to user applications to give them more control over which backends to load and when. See the references below for more context. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3182 Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801778733 Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801928990 * ruby : add check for rwc is NULL This commit adds a check to ensure that the `rwc` pointer is not NULL before attempting to mark its members in the garbage collector. The motivation for this is an attempt to see if this fixes the CI build, as I'm not able to reproduce the issue locally. Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/15299612277/job/43036694928?pr=3196	2025-05-29 08:03:17 +02:00
Georgi Gerganov	26eb48cb08	talk-llama : sync llama.cpp ggml-ci	2025-05-27 18:03:00 +03:00
Daniel Bevenius	450de0787e	node : enable no_prints to suppress all output (#3189 ) This commit enables the node addon to suppress all output, even the result of the transcription, if the no_prints parameter is set to true. The motivation for this is that for the node addon there is a fulfillment handler/success callback to process the transcription result. And it might be useful to be able to disable the printing of the transcription result to the console, so that the user can handle the result in their own way. Refs: https://github.com/ggml-org/whisper.cpp/issues/3176	2025-05-27 05:51:47 +02:00
matteng1	ea9f206f18	talk-llama : fix for swedish umlauts + expose model inference settings in talk-llama.cpp (#3187 ) Quick fix for not removing swedish umlauts. * Update talk-llama.cpp Expose model inference settings to user instead of hard coding them. Same defaults as previous defaults. * Update examples/talk-llama/talk-llama.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-26 07:57:39 +02:00
Sacha Arbonel	78b31ca782	server : Add k6 Load Testing Script (#3175 ) * add load testing script and update README for k6 integration	2025-05-22 10:03:04 +02:00
Georgi Gerganov	6b6cf19c65	talk-llama : sync llama.cpp ggml-ci	2025-05-19 14:58:39 +03:00
Daniel Bevenius	f389d7e3e5	examples : add --print-confidence option to cli (#3150 ) * examples : add --print-confidence option to cli This commit adds a new command-line option `--print-confidence` to the whisper-cli. When enabled, this option prints the confidence level of each token in the transcribed text using ANSI formatting codes. The confidence levels are represented using different styles: ```console main: confidence: highlighted (low confidence), underlined (medium), dim (high confidence) ``` Refs: https://github.com/ggml-org/whisper.cpp/issues/3135	2025-05-14 19:21:48 +02:00
Daniel Bevenius	3882a099e1	server : add --flash-attn usage output (#3152 ) This commit adds the `--flash-attn` option to the usage output of the server example. The motivation for this change is that while it is possible to set this option it is not printed in the usage output.	2025-05-14 15:22:05 +02:00
Georgi Gerganov	f890560575	talk-llama : sync llama.cpp ggml-ci	2025-05-13 13:59:21 +03:00
Daniel Bevenius	fbad8058c4	examples : add VAD speech segments example (#3147 ) This commit adds an example that demonstrates how to use a VAD (Voice Activity Detection) model to segment an audio file into speech segments. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3144	2025-05-13 12:31:00 +02:00
Daniel Bevenius	b2513a6208	vad : remove shortform for --vad option in cli.cpp (#3145 ) This commit removes the shortform for the --vad option in cli.cpp. The motivation for this is that `-v` is often used for verbose or version in many tools and this might cause confusion. Refs: https://github.com/ggml-org/whisper.cpp/pull/3065#issuecomment-2873243334	2025-05-13 06:04:05 +02:00
Tomer Schlesinger	587ea01f55	docs : update README.md for whisper.objc app (#2569 )	2025-05-13 06:03:50 +02:00
Daniel Bevenius	e41bc5c61a	vad : add initial Voice Activity Detection (VAD) support (#3065 ) * vad : add initial Voice Activity Detection (VAD) support This commit adds support for Voice Activity Detection (VAD). When enabled, this feature will process the audio input and detect speech segments. This information is then used to reduce the number of samples that need to be processed by whisper_full. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3003 --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-12 16:10:11 +02:00
Daniel Bevenius	186855e38b	cli : print color scheme info for --print-colors (#3141 ) This commit adds a description of the color scheme used in the CLI when the --print-colors option is enabled. The motivation for this is that it is not immediately clear what the color scheme is when using the CLI with the --print-colors option. Example output: ```console $ ./build/bin/whisper-cli -f samples/jfk.wav --print-colors ... main: color scheme: red (low confidence), yellow (medium), green (high confidence) [00:00:00.000 --> 00:00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country. ``` The description will not be displayed if the `--no-prints` option is set. Refs: https://github.com/ggml-org/whisper.cpp/issues/3135	2025-05-12 10:43:04 +02:00
Daniel Bevenius	4730950492	examples : update link to Paul Tol's color scheme [no ci] (#3140 ) This commit updates the link to Paul Tol's color scheme in the `examples/common.h` file. The previous link was outdated and pointed to a non-existent page.	2025-05-12 09:02:06 +02:00
Enes Grahovac	5d4390d281	examples : add HEAPU8 to all of the exported runtime methods (#3134 ) This commit adds HEAPU8 to the list of exported methods. The motivation for this commit is that currently this is causing an error on Windows systems where HEAPU8 is undefined, which results in the following error message in the web console: main.js:1 Uncaught TypeError: Cannot read properties of undefined (reading 'buffer') at __emval_get_property (main.js:1:1363125) at 003a453a:0xc4a47 at 003a453a:0xc51cd at Object.full_default (eval at craftInvokerFunction (main.js:1:1347011), <anonymous>:9:10) at whisper.cpp/:647:42 danbev originally fixed this for whisper.wasm, stream.wasm, and command.stream, but the issue still exists on the other examples which I patch in this code. Resolves: #3059	2025-05-10 06:44:13 +02:00
Daniel Bevenius	9791647653	wasm : add note about worker.js file generation [no ci] (#3133 ) This commit updates the documentation for the WASM examples to include a note about the generation of the `worker.js` file. As of Emscripten 3.1.58 (April 2024), separate worker.js files are no longer generated and the worker is embedded in the main JS file. The motivation for this change is to inform users about the new behavior of Emscripten and why the `worker.js` file may not be present. Refs: https://github.com/ggml-org/whisper.cpp/issues/3123	2025-05-09 15:42:45 +02:00
Daniel Bevenius	b6f3fa4059	stream.wasm : add HEAPU8 to exported runtime methods (#3130 ) * stream.wasm : add HEAPU8 to exported runtime methods This commit adds HEAPU8 to the list of exported methods for stream.wasm. The motivation for this is that without it HEAPU8 will be undefined, and when its 'buffer' attribute is accessed this will cause an error as reported in the referenced issue. Note that to test this, make sure that the web browser's cache is cleared first. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3123 * command.wasm : add HEAPU8 to exported runtime methods	2025-05-08 16:58:34 +02:00
Georgi Gerganov	4a512cb153	cli : avoid std::exchange ggml-ci	2025-05-07 15:39:32 +03:00


548 Commits