Mirror of https://github.com/ggml-org/whisper.cpp.git (synced 2025-09-15 13:28:35 +08:00)

* whisper : use flash attention in the encoder
* whisper : add kv_pad
* whisper : remove extra backend instance (huh?)
* whisper : use FA for cross-attention
* whisper : use FA for self-attention
* whisper : simplify encoder FA
* whisper : add flash_attn runtime parameter
* scripts : add bench log
* scripts : add M1 Pro bench log


| File |
|---|
| bench-all-gg.txt |
| bench-all.sh |
| bench-wts.sh |
| bench.py |
| convert-all.sh |
| deploy-wasm.sh |
| gen-authors.sh |
| quantize-all.sh |
| sha-all.sh |
| sync-ggml-am.sh |
| sync-ggml.last |
| sync-ggml.sh |
| sync-llama.sh |