mirror of
https://github.com/ggml-org/whisper.cpp.git
synced 2025-09-15 13:28:35 +08:00
Use a phase vocoder to speed up the audio tempo by scaling down the frequencies in the frequency domain. This reduces the computation in the Encoder by a factor of 2. Transcription accuracy degrades, but for slow to normal speech it still seems to be very good. I think this could be useful for real-time transcription, e.g. the "stream" example.
| Name |
|---|
| bench |
| main |
| stream |
| whisper.nvim |
| whisper.objc |
| whisper.wasm |
| CMakeLists.txt |
| dr_wav.h |
| generate-karaoke.sh |
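
The frequency-scaling step described in the commit message above can be illustrated with a small sketch. This is not the code from the commit and not a full phase vocoder. It assumes that the spectrum of a double-length analysis window is compressed by averaging adjacent bins, so that half as many frames reach the Encoder. The helper names `fold_bins_by_two` and `frame_count` are hypothetical, and the 16 kHz sample rate and 160-sample hop mentioned in the note below are assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the whisper.cpp API): compress a magnitude
// or power spectrum by a factor of 2 by averaging each pair of adjacent bins.
// Output bin j covers input bins 2j and 2j+1, so content at bin index k in
// the input ends up at bin index k/2 in the output.
std::vector<float> fold_bins_by_two(const std::vector<float> & spectrum) {
    const std::size_t n_out = spectrum.size() / 2; // a trailing odd bin is dropped
    std::vector<float> out(n_out);
    for (std::size_t j = 0; j < n_out; ++j) {
        out[j] = 0.5f * (spectrum[2 * j] + spectrum[2 * j + 1]);
    }
    return out;
}

// Hypothetical helper: number of analysis frames produced for n_samples of
// audio with a given hop size (integer division, partial frames ignored).
std::size_t frame_count(std::size_t n_samples, std::size_t hop) {
    return hop == 0 ? 0 : n_samples / hop;
}
```

Under these assumptions, 30 seconds of 16 kHz audio (480000 samples) gives `frame_count(480000, 160)` frames at the normal hop and `frame_count(480000, 320)` frames when the window and hop are doubled. Halving the number of frames is what would account for the factor-of-2 reduction in Encoder work.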