whisper.cpp/ggml/include
Georgi Gerganov d3aab3efde llama : add gpt-oss (llama/15091)
* oai moe

* compat with new checkpoint

* add attn sink impl

* add rope scaling yarn

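A minimal sketch of the YaRN angle computation this refers to, simplified from ggml's rope_yarn helper; the ramp/ext_factor handling here is illustrative, not the exact implementation:

    // Blend the position-interpolated angle with the original angle,
    // per rotary dimension, using a ramp so low-frequency dims keep
    // interpolation and high-frequency dims keep extrapolation.
    #include <math.h>

    static float yarn_theta(float theta_extrap, float freq_scale,
                            float ramp,        // 0..1 for this dimension
                            float ext_factor) {
        const float theta_interp = freq_scale * theta_extrap;
        const float mix = ramp * ext_factor;
        // the full version also corrects attention magnitude by a factor
        // of roughly 1 + 0.1f * logf(1.0f / freq_scale)
        return theta_interp * (1.0f - mix) + theta_extrap * mix;
    }
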
* logits match with latest transformers code

* wip chat template

* rm trailing space

* use ggml_scale_bias

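ggml_scale_bias fuses a scalar multiply and add into one pass; a plain-C sketch of the semantics (not the ggml call itself):

    // y = x*s + b, with scalar s and b
    static void scale_bias(const float * x, int n, float s, float b, float * y) {
        for (int i = 0; i < n; ++i) {
            y[i] = x[i] * s + b;
        }
    }
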
* rm redundant is_swa_all

* convert interleaved gate_up

* graph : fix activation function to match reference (llama/7)

* vocab : handle o200k_harmony special tokens

* ggml : add attention sinks support (llama/1)

* llama : add attn sinks

* ggml : add attn sinks

* cuda : add attn sinks

* vulkan : add support for sinks in softmax

remove unnecessary return

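A minimal sketch of what an attention sink does to the softmax, assuming one learned scalar logit per head that joins the normalization but has no value vector (illustrative math, not the ggml kernel):

    #include <math.h>

    // Softmax over n scores plus one sink logit; the sink absorbs part of
    // the probability mass, so the returned weights sum to less than 1.
    static void softmax_with_sink(const float * scores, int n, float sink, float * out) {
        float m = sink;
        for (int i = 0; i < n; ++i) {
            if (scores[i] > m) m = scores[i];
        }
        float denom = expf(sink - m); // the sink's share of the mass
        for (int i = 0; i < n; ++i) {
            out[i] = expf(scores[i] - m);
            denom += out[i];
        }
        for (int i = 0; i < n; ++i) {
            out[i] /= denom;
        }
    }
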
* ggml : add fused swiglu_oai op (llama/11)

* ggml : add fused swiglu_oai op

* Update ggml/src/ggml-cpu/ops.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* update CUDA impl

* cont : metal impl

* add vulkan impl

* test-backend-ops : more test cases, clean up

* llama : remove unfused impl

* remove extra lines

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>

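A plain-C sketch of the fused swiglu_oai activation, assuming the gpt-oss reference constants (alpha = 1.702, limit = 7.0) and the interleaved gate/up layout mentioned above; illustrative, not the ggml op:

    #include <math.h>

    // x holds n interleaved [gate, up] pairs; writes n/2 outputs.
    static void swiglu_oai(const float * x, int n, float alpha, float limit, float * y) {
        for (int i = 0; i < n / 2; ++i) {
            float g = x[2*i + 0];
            float u = x[2*i + 1];
            g = g > limit ? limit : g;                         // clamp gate from above
            u = u < -limit ? -limit : (u > limit ? limit : u); // clamp linear branch
            const float glu = g / (1.0f + expf(-alpha * g));   // g * sigmoid(alpha * g)
            y[i] = glu * (u + 1.0f);
        }
    }
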
* repack mxfp4 upon conversion

* clean up a bit

* enable thinking

* add quick hack to render only some special tokens

* fix bf16 conversion

* remove vocab hack

* webui ok

* support chat parsing for gpt-oss

* fix webui

* direct mapping mxfp4, FINALLY

* force using mxfp4

* properly use lazy tensor

* ggml : add mxfp4

ggml : use e8m0 conversion instead of powf

Co-authored-by: Diego Devesa <slarengh@gmail.com>

change kvalues_mxfp4 table to match e2m1 (llama/6)

metal : remove quantization for now (not used)

cuda : fix disabled CUDA graphs due to ffn moe bias

vulkan : add support for mxfp4

cont : add cm2 dequant

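A sketch of MXFP4 dequantization as added here: blocks of 32 E2M1 nibbles share one E8M0 power-of-two scale, and the E8M0 conversion builds the float from its exponent instead of calling powf. Struct layout and table values are illustrative, not the exact ggml definitions:

    #include <stdint.h>
    #include <math.h>

    #define QK_MXFP4 32

    typedef struct {
        uint8_t e;                 // shared E8M0 scale: value = 2^(e - 127)
        uint8_t qs[QK_MXFP4 / 2];  // 32 packed 4-bit E2M1 codes
    } block_mxfp4;

    // E2M1 code -> value, in half-units so the table stays integral
    // (same idea as kvalues_mxfp4; the exact ggml table may differ)
    static const int8_t kvalues_e2m1[16] = {
        0, 1, 2, 3, 4, 6, 8, 12, 0, -1, -2, -3, -4, -6, -8, -12,
    };

    static float e8m0_to_fp32(uint8_t e) {
        return ldexpf(1.0f, (int) e - 127); // no powf needed
    }

    static void dequant_block_mxfp4(const block_mxfp4 * b, float * y) {
        const float d = 0.5f * e8m0_to_fp32(b->e); // fold in the half-unit table
        for (int i = 0; i < QK_MXFP4 / 2; ++i) {
            y[i]                = d * kvalues_e2m1[b->qs[i] & 0x0F];
            y[i + QK_MXFP4 / 2] = d * kvalues_e2m1[b->qs[i] >> 4];
        }
    }
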
* ggml : add ggml_add_id (llama/13)

* ggml : add ggml_add_id

* add cuda impl

* llama : add weight support check for add_id

* perf opt

* add vulkan impl

* rename cuda files

* add metal impl

* allow in-place ggml_add_id

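A sketch of the ggml_add_id semantics: add a row of b, selected per token by ids, to each row of a, which is what per-expert FFN biases in an MoE graph need. Shapes and names here are illustrative:

    #include <stddef.h>

    // dst[t][:] = a[t][:] + b[ids[t]][:]
    static void add_id(const float * a, const float * b, const int * ids,
                       int n_tok, int n_embd, float * dst) {
        for (int t = 0; t < n_tok; ++t) {
            const float * brow = b + (size_t) ids[t] * n_embd;
            for (int i = 0; i < n_embd; ++i) {
                dst[t * n_embd + i] = a[t * n_embd + i] + brow[i];
            }
        }
    }
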
* llama : keep biases on CPU with --cpu-moe

* llama : fix compile error

ggml-ci

* cuda : add fallback for __nv_cvt_e8m0_to_bf16raw

ggml-ci

* cleanup

ggml-ci

* sycl : fix supports_op for MXFP4

ggml-ci

* fix "Unknown reasoning format" error

* ggml-cpu : fix AVX build

ggml-ci

* fix hip build

ggml-ci

* cuda : add mxfp4 dequantization support for cuBLAS

ggml-ci

* ggml-cpu : fix mxfp4 fallback definitions for some architectures

ggml-ci

* cuda : fix version required for __nv_cvt_e8m0_to_bf16raw

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Co-authored-by: slaren <slarengh@gmail.com>
2025-08-18 20:30:45 +03:00
ggml-alloc.h ggml : upgrade init_tensor API to return a ggml_status (llama/11854) 2025-03-08 15:13:01 +02:00
ggml-backend.h vulkan: Add fusion support for RMS_NORM+MUL (llama/14366) 2025-07-01 17:54:53 +03:00
ggml-blas.h ggml : build backends as libraries (llama/10256) 2024-11-20 21:00:08 +02:00
ggml-cann.h ggml : build backends as libraries (llama/10256) 2024-11-20 21:00:08 +02:00
ggml-cpp.h ggml : fix ggml_gallocr_ptr type (ggml/1205) 2025-05-01 13:29:02 +03:00
ggml-cpu.h ggml : add ggml_set_rows (llama/14274) 2025-07-01 17:54:53 +03:00
ggml-cuda.h ggml : build backends as libraries (llama/10256) 2024-11-20 21:00:08 +02:00
ggml-metal.h repo : update links to new url (llama/11886) 2025-02-27 08:55:36 +02:00
ggml-opencl.h Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693) 2024-12-18 12:52:16 +02:00
ggml-opt.h mnist: fix segmentation fault (ggml/1227) 2025-05-19 14:58:39 +03:00
ggml-rpc.h rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (llama/12943) 2025-05-01 13:29:02 +03:00
ggml-sycl.h ggml : build backends as libraries (llama/10256) 2024-11-20 21:00:08 +02:00
ggml-vulkan.h vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494) 2025-02-27 08:55:36 +02:00
ggml-webgpu.h ggml: Add initial WebGPU backend (llama/14521) 2025-07-20 00:23:50 +03:00
ggml.h llama : add gpt-oss (llama/15091) 2025-08-18 20:30:45 +03:00
gguf.h GGUF: C++ refactor, backend support, misc fixes (skip) (llama/11030) 2025-01-14 10:38:01 +02:00