Release 20250625

2025-09-15 15:18:35 +08:00 · 2025-06-25 18:05:47 -07:00 · 2025-06-25 18:03:25 -07:00 · 2025-06-25 18:00:48 -07:00 · 2025-06-25 17:54:30 -07:00 · 2025-06-25 17:42:09 -07:00
19 changed files with 186 additions and 93 deletions
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@ -0,0 +1,13 @@
+# Keep GitHub Actions up to date with GitHub's Dependabot...
+# https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot
+# https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem
+version: 2
+updates:
+  - package-ecosystem: github-actions
+    directory: /
+    groups:
+      github-actions:
+        patterns:
+          - "*"  # Group all Actions updates into a single larger pull request
+    schedule:
+      interval: weekly
--- a/.github/workflows/python-publish.yml
+++ b/.github/workflows/python-publish.yml
@ -8,23 +8,23 @@ jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4
    - uses: actions-ecosystem/action-regex-match@v2
      id: regex-match
      with:
        text: ${{ github.event.head_commit.message }}
        regex: '^Release ([^ ]+)'
    - name: Set up Python
-      uses: actions/setup-python@v4
+      uses: actions/setup-python@v5
      with:
-        python-version: '3.8'
+        python-version: '3.12'
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
-        pip install setuptools wheel twine
+        pip install setuptools wheel twine build
    - name: Release
      if: ${{ steps.regex-match.outputs.match != '' }}
-      uses: softprops/action-gh-release@v1
+      uses: softprops/action-gh-release@v2
      with:
        tag_name: v${{ steps.regex-match.outputs.group1 }}
    - name: Build and publish
@ -33,5 +33,5 @@ jobs:
        TWINE_USERNAME: __token__
        TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
      run: |
-        python setup.py sdist
+        python -m build --sdist
        twine upload dist/*
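
Note on the release trigger: the workflow above tags a release only when the head commit message matches `^Release ([^ ]+)`, and uses the first capture group as the tag suffix. The following is a minimal Python sketch of that matching step, not the action's own code (the action evaluates the pattern in JavaScript). The helper name `release_tag` is hypothetical.

```python
import re
from typing import Optional

# Same pattern as the `regex` input in python-publish.yml.
RELEASE_PATTERN = re.compile(r"^Release ([^ ]+)")


def release_tag(commit_message: str) -> Optional[str]:
    """Return the tag the workflow would create, or None if the step is skipped."""
    match = RELEASE_PATTERN.match(commit_message)
    if match is None:
        # Mirrors `if: steps.regex-match.outputs.match != ''`.
        return None
    # Mirrors `tag_name: v${{ steps.regex-match.outputs.group1 }}`.
    return "v" + match.group(1)


print(release_tag("Release 20250625"))
print(release_tag("fix typo data/README.md"))
```

Under this sketch, a commit titled `Release 20250625` yields the tag `v20250625`, while any other commit message leaves the release step skipped.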
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@ -11,19 +11,19 @@ jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
      - name: Fetch base branch
        run: git fetch origin ${{ github.base_ref }}
-      - uses: actions/setup-python@v4
+      - uses: actions/setup-python@v5
        with:
-          python-version: "3.8"
+          python-version: "3.9"
          architecture: x64
      - name: Get pip cache dir
        id: pip-cache
        run: |
          echo "dir=$(pip cache dir)" >> $GITHUB_OUTPUT
      - name: pip/pre-commit cache
-        uses: actions/cache@v3
+        uses: actions/cache@v4
        with:
          path: |
            ${{ steps.pip-cache.outputs.dir }}
@ -33,15 +33,19 @@ jobs:
            ${{ runner.os }}-pip-pre-commit
      - name: pre-commit
        run: |
-          pip install -U pre-commit
+          pip install --upgrade pre-commit
          pre-commit install --install-hooks
          pre-commit run --all-files
  whisper-test:
    needs: pre-commit
    runs-on: ubuntu-latest
    strategy:
+      fail-fast: false
      matrix:
        include:
+          - python-version: '3.8'
+            pytorch-version: 1.10.1
+            numpy-requirement: "'numpy<2'"
          - python-version: '3.8'
            pytorch-version: 1.13.1
            numpy-requirement: "'numpy<2'"
@ -60,10 +64,16 @@ jobs:
          - python-version: '3.12'
            pytorch-version: 2.4.1
            numpy-requirement: "'numpy'"
+          - python-version: '3.12'
+            pytorch-version: 2.5.1
+            numpy-requirement: "'numpy'"
+          - python-version: '3.13'
+            pytorch-version: 2.5.1
+            numpy-requirement: "'numpy'"
    steps:
-      - uses: conda-incubator/setup-miniconda@v2
+      - uses: conda-incubator/setup-miniconda@v3
      - run: conda install -n test ffmpeg python=${{ matrix.python-version }}
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
      - run: echo "$CONDA/envs/test/bin" >> $GITHUB_PATH
      - run: pip3 install .["dev"] ${{ matrix.numpy-requirement }} torch==${{ matrix.pytorch-version }}+cpu --index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pypi.org/simple
      - run: pytest --durations=0 -vv -k 'not test_transcribe or test_transcribe[tiny] or test_transcribe[tiny.en]' -m 'not requires_cuda'
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@ -1,6 +1,6 @@
 repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.0.1
+    rev: v5.0.0
    hooks:
      - id: check-json
      - id: end-of-file-fixer
@ -11,17 +11,17 @@ repos:
      - id: check-added-large-files
        args: [--maxkb=4096]
  - repo: https://github.com/psf/black
-    rev: 23.7.0
+    rev: 25.1.0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/isort
-    rev: 5.12.0
+    rev: 6.0.0
    hooks:
      - id: isort
        name: isort (python)
        args: ["--profile", "black", "-l", "88", "--trailing-comma", "--multi-line", "3"]
  - repo: https://github.com/pycqa/flake8.git
-    rev: 6.0.0
+    rev: 7.1.1
    hooks:
      - id: flake8
        types: [python]
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -1,5 +1,26 @@
 # CHANGELOG

+## [v20250625](https://github.com/openai/whisper/releases/tag/v20250625)
+
+* Fix: Update torch.load to use weights_only=True to prevent security w… ([#2451](https://github.com/openai/whisper/pull/2451))
+* Fix: Ensure DTW cost tensor is on the same device as input tensor ([#2561](https://github.com/openai/whisper/pull/2561))
+* docs: updated README to specify translation model limitation ([#2547](https://github.com/openai/whisper/pull/2547))
+* Fixed triton kernel update to support latest triton versions ([#2588](https://github.com/openai/whisper/pull/2588))
+* Fix: GitHub display errors for Jupyter notebooks ([#2589](https://github.com/openai/whisper/pull/2589))
+* Bump the github-actions group with 3 updates ([#2592](https://github.com/openai/whisper/pull/2592))
+* Keep GitHub Actions up to date with GitHub's Dependabot ([#2486](https://github.com/openai/whisper/pull/2486))
+* pre-commit: Upgrade black v25.1.0 and isort v6.0.0 ([#2514](https://github.com/openai/whisper/pull/2514))
+* GitHub Actions: Add Python 3.13 to the testing ([#2487](https://github.com/openai/whisper/pull/2487))
+* PEP 621: Migrate from setup.py to pyproject.toml ([#2435](https://github.com/openai/whisper/pull/2435))
+* pre-commit autoupdate && pre-commit run --all-files ([#2484](https://github.com/openai/whisper/pull/2484))
+* Upgrade GitHub Actions ([#2430](https://github.com/openai/whisper/pull/2430))
+* Bugfix: Illogical "Avoid computing higher temperatures on no_speech" ([#1903](https://github.com/openai/whisper/pull/1903))
+* Updating README and doc strings to reflect that n_mels can now be 128 ([#2049](https://github.com/openai/whisper/pull/2049))
+* fix typo data/README.md ([#2433](https://github.com/openai/whisper/pull/2433))
+* Update README.md ([#2379](https://github.com/openai/whisper/pull/2379))
+* Add option to carry initial_prompt with the sliding window ([#2343](https://github.com/openai/whisper/pull/2343))
+* more pytorch versions in tests ([#2408](https://github.com/openai/whisper/pull/2408))
+
 ## [v20240930](https://github.com/openai/whisper/releases/tag/v20240930)

 * allowing numpy 2 in tests ([#2362](https://github.com/openai/whisper/pull/2362))
--- a/README.md
+++ b/README.md
@ -77,25 +77,35 @@ Whisper's performance varies widely depending on the language. The figure below

 ![WER breakdown by language](https://github.com/openai/whisper/assets/266841/f4619d66-1058-4005-8f67-a9d811b77c62)

-
-
 ## Command-line usage

 The following command will transcribe speech in audio files, using the `turbo` model:

+```bash
 whisper audio.flac audio.mp3 audio.wav --model turbo
+```

-The default setting (which selects the `small` model) works well for transcribing English. To transcribe an audio file containing non-English speech, you can specify the language using the `--language` option:
+The default setting (which selects the `turbo` model) works well for transcribing English. However, **the `turbo` model is not trained for translation tasks**. If you need to **translate non-English speech into English**, use one of the **multilingual models** (`tiny`, `base`, `small`, `medium`, `large`) instead of `turbo`. 

+For example, to transcribe an audio file containing non-English speech, you can specify the language:
+
+```bash
 whisper japanese.wav --language Japanese
+```

-Adding `--task translate` will translate the speech into English:
+To **translate** speech into English, use:

-    whisper japanese.wav --language Japanese --task translate
+```bash
+whisper japanese.wav --model medium --language Japanese --task translate
+```
+
+> **Note:** The `turbo` model will return the original language even if `--task translate` is specified. Use `medium` or `large` for the best translation results.

 Run the following to view all available options:

+```bash
 whisper --help
+```

 See [tokenizer.py](https://github.com/openai/whisper/blob/main/whisper/tokenizer.py) for the list of all available languages.

@ -126,7 +136,7 @@ audio = whisper.load_audio("audio.mp3")
 audio = whisper.pad_or_trim(audio)

 # make log-Mel spectrogram and move to the same device as the model
-mel = whisper.log_mel_spectrogram(audio).to(model.device)
+mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)

 # detect the spoken language
 _, probs = model.detect_language(mel)
--- a/data/README.md
+++ b/data/README.md
@ -45,7 +45,7 @@ We downloaded the [CHiME-5 dataset](https://spandh.dcs.shef.ac.uk//chime_challen

 ### AMI-IHM, AMI-SDM1

-We preprocessed the [AMI Corpus](https://groups.inf.ed.ac.uk/ami/corpus/overview.shtml) by following the stage 0 ad 2 of the [s5b recipe](https://github.com/kaldi-asr/kaldi/tree/master/egs/ami/s5b).
+We preprocessed the [AMI Corpus](https://groups.inf.ed.ac.uk/ami/corpus/overview.shtml) by following the stage 0 and 2 of the [s5b recipe](https://github.com/kaldi-asr/kaldi/tree/master/egs/ami/s5b).


 ## Long-form English-only datasets
--- a/notebooks/LibriSpeech.ipynb
+++ b/notebooks/LibriSpeech.ipynb
@ -949,7 +949,8 @@
      "style": "IPY_MODEL_039b53f2702c4179af7e0548018d0588",
      "value": " 164/164 [05:08&lt;00:00,  1.86s/it]"
     }
-    }
+    },
+    "state": {}
   }
  }
 },
--- a/notebooks/Multilingual_ASR.ipynb
+++ b/notebooks/Multilingual_ASR.ipynb
@ -4219,7 +4219,8 @@
            "_view_name": "StyleView",
            "description_width": ""
          }
-        }
+        },
+        "state": {}
      }
    }
  },
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,3 +1,50 @@
+[build-system]
+build-backend = "setuptools.build_meta"
+
+requires = [ "setuptools>=61.2" ]
+
+[project]
+name = "openai-whisper"
+description = "Robust Speech Recognition via Large-Scale Weak Supervision"
+readme.content-type = "text/markdown"
+readme.file = "README.md"
+license = { text = "MIT" }
+authors = [ { name = "OpenAI" } ]
+requires-python = ">=3.8"
+classifiers = [
+  "Programming Language :: Python :: 3 :: Only",
+  "Programming Language :: Python :: 3.8",
+  "Programming Language :: Python :: 3.9",
+  "Programming Language :: Python :: 3.10",
+  "Programming Language :: Python :: 3.11",
+  "Programming Language :: Python :: 3.12",
+  "Programming Language :: Python :: 3.13",
+]
+dynamic = [ "version" ]
+dependencies = [
+  "more-itertools",
+  "numba",
+  "numpy",
+  "tiktoken",
+  "torch",
+  "tqdm",
+  "triton>=2; (platform_machine=='x86_64' and sys_platform=='linux') or sys_platform=='linux2'",
+]
+optional-dependencies.dev = [ "black", "flake8", "isort", "pytest", "scipy" ]
+urls = { Homepage = "https://github.com/openai/whisper" }
+scripts.whisper = "whisper.transcribe:cli"
+
+[tool.setuptools]
+py-modules = [ "whisper" ]
+include-package-data = true
+
+[tool.setuptools.dynamic]
+version = { attr = "whisper.version.__version__" }
+
+[tool.setuptools.packages.find]
+exclude = [ "tests*" ]
+namespaces = false
+
 [tool.black]

 [tool.isort]
@ -5,4 +52,3 @@ profile = "black"
 include_trailing_comma = true
 line_length = 88
 multi_line_output = 3
-
--- a/setup.py
+++ b/setup.py
@ -1,42 +0,0 @@
-import platform
-import sys
-from pathlib import Path
-
-import pkg_resources
-from setuptools import find_packages, setup
-
-
-def read_version(fname="whisper/version.py"):
-    exec(compile(open(fname, encoding="utf-8").read(), fname, "exec"))
-    return locals()["__version__"]
-
-
-requirements = []
-if sys.platform.startswith("linux") and platform.machine() == "x86_64":
-    requirements.append("triton>=2.0.0")
-
-setup(
-    name="openai-whisper",
-    py_modules=["whisper"],
-    version=read_version(),
-    description="Robust Speech Recognition via Large-Scale Weak Supervision",
-    long_description=open("README.md", encoding="utf-8").read(),
-    long_description_content_type="text/markdown",
-    readme="README.md",
-    python_requires=">=3.8",
-    author="OpenAI",
-    url="https://github.com/openai/whisper",
-    license="MIT",
-    packages=find_packages(exclude=["tests*"]),
-    install_requires=[
-        str(r)
-        for r in pkg_resources.parse_requirements(
-            Path(__file__).with_name("requirements.txt").open()
-        )
-    ],
-    entry_points={
-        "console_scripts": ["whisper=whisper.transcribe:cli"],
-    },
-    include_package_data=True,
-    extras_require={"dev": ["pytest", "scipy", "black", "flake8", "isort"]},
-)
--- a/whisper/__init__.py
+++ b/whisper/__init__.py
@ -147,7 +147,8 @@ def load_model(
    with (
        io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, "rb")
    ) as fp:
-        checkpoint = torch.load(fp, map_location=device)
+        kwargs = {"weights_only": True} if torch.__version__ >= "1.13" else {}
+        checkpoint = torch.load(fp, map_location=device, **kwargs)
    del checkpoint_file

    dims = ModelDimensions(**checkpoint["dims"])
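
Review note on the version check: `torch.__version__ >= "1.13"` compares strings character by character. That gives the intended answer for the PyTorch versions in the test matrix, but it would treat a version such as `1.9.0` as newer than `1.13`. A possible alternative, sketched below under the assumption that only the major and minor components matter, is to compare integer tuples. The helper names are hypothetical and are not part of the commit.

```python
import re
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, int]:
    """Extract (major, minor) as integers from a string such as '2.5.1+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_weights_only(version: str) -> bool:
    """True when the given version is at least 1.13, the threshold used in the diff."""
    return version_tuple(version) >= (1, 13)


# Hypothetical usage with the call from the diff:
#   kwargs = {"weights_only": True} if supports_weights_only(torch.__version__) else {}
print(supports_weights_only("1.10.1"), supports_weights_only("2.5.1+cpu"))
```

The tuple comparison orders `(1, 9)` before `(1, 13)`, which the string comparison does not.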
--- a/whisper/audio.py
+++ b/whisper/audio.py
@ -122,7 +122,7 @@ def log_mel_spectrogram(
        The path to audio or either a NumPy array or Tensor containing the audio waveform in 16 kHz

    n_mels: int
-        The number of Mel-frequency filters, only 80 is supported
+        The number of Mel-frequency filters, only 80 and 128 are supported

    padding: int
        Number of zero samples to pad to the right
@ -132,7 +132,7 @@ def log_mel_spectrogram(

    Returns
    -------
-    torch.Tensor, shape = (80, n_frames)
+    torch.Tensor, shape = (n_mels, n_frames)
        A Tensor that contains the Mel spectrogram
    """
    if not torch.is_tensor(audio):
--- a/whisper/normalizers/basic.py
+++ b/whisper/normalizers/basic.py
@ -30,15 +30,19 @@ def remove_symbols_and_diacritics(s: str, keep=""):
    and drop any diacritics (category 'Mn' and some manual mappings)
    """
    return "".join(
+        (
            c
            if c in keep
-        else ADDITIONAL_DIACRITICS[c]
+            else (
+                ADDITIONAL_DIACRITICS[c]
                if c in ADDITIONAL_DIACRITICS
-        else ""
+                else (
+                    ""
                    if unicodedata.category(c) == "Mn"
-        else " "
-        if unicodedata.category(c)[0] in "MSP"
-        else c
+                    else " " if unicodedata.category(c)[0] in "MSP" else c
+                )
+            )
+        )
        for c in unicodedata.normalize("NFKD", s)
    )
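
Note on the reformatted expression: the added parentheses change only the layout, not the evaluation order of the chained conditional. As a reading aid, the same decision chain can be written as explicit `if` statements. This is a sketch, not the project's code; `map_char` is a hypothetical helper, and the two-entry `ADDITIONAL_DIACRITICS` table below is a stand-in because the real table is not part of this diff.

```python
import unicodedata

# Hypothetical subset; the real mapping is defined elsewhere in basic.py.
ADDITIONAL_DIACRITICS = {"ø": "o", "ß": "ss"}


def map_char(c: str, keep: str = "") -> str:
    """Apply the same precedence as the nested conditional in the diff."""
    if c in keep:
        return c  # explicitly preserved characters win
    if c in ADDITIONAL_DIACRITICS:
        return ADDITIONAL_DIACRITICS[c]  # manual replacements
    category = unicodedata.category(c)
    if category == "Mn":
        return ""  # drop combining marks left over from NFKD
    if category[0] in "MSP":
        return " "  # other marks, symbols, punctuation become spaces
    return c


def remove_symbols_and_diacritics(s: str, keep: str = "") -> str:
    return "".join(map_char(c, keep) for c in unicodedata.normalize("NFKD", s))


print(repr(remove_symbols_and_diacritics("caf\u00e9!")))
```

NFKD normalization splits `é` into `e` plus a combining accent, so the accent is dropped and the trailing `!` becomes a space.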

--- a/whisper/timing.py
+++ b/whisper/timing.py
@ -117,7 +117,7 @@ def dtw_cuda(x, BLOCK_SIZE=1024):
    x_skew = x_skew.T.contiguous()
    cost = torch.ones(N + M + 2, M + 2) * np.inf
    cost[0, 0] = 0
-    cost = cost.cuda()
+    cost = cost.to(x.device)
    trace = torch.zeros_like(cost, dtype=torch.int32)

    dtw_kernel[(1,)](
--- a/whisper/transcribe.py
+++ b/whisper/transcribe.py
@ -46,6 +46,7 @@ def transcribe(
    no_speech_threshold: Optional[float] = 0.6,
    condition_on_previous_text: bool = True,
    initial_prompt: Optional[str] = None,
+    carry_initial_prompt: bool = False,
    word_timestamps: bool = False,
    prepend_punctuations: str = "\"'“¿([{-",
    append_punctuations: str = "\"'.。,，!！?？:：”)]}、",
@ -102,6 +103,11 @@ def transcribe(
        "prompt-engineer" a context for transcription, e.g. custom vocabularies or proper nouns
        to make it more likely to predict those word correctly.

+    carry_initial_prompt: bool
+        If carry_initial_prompt is True, `initial_prompt` is prepended to the prompt of each internal
+        `decode()` call. If there is not enough context space at the start of the prompt, it is
+        left-sliced to make space.
+
    decode_options: dict
        Keyword arguments to construct `DecodingOptions` instances

@ -208,6 +214,8 @@ def transcribe(
            if (
                no_speech_threshold is not None
                and decode_result.no_speech_prob > no_speech_threshold
+                and logprob_threshold is not None
+                and decode_result.avg_logprob < logprob_threshold
            ):
                needs_fallback = False  # silence
            if not needs_fallback:
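
Note on the silence check: after this change, a high `no_speech_prob` cancels the temperature fallback only when the low average log-probability check has also failed. The sketch below restates that decision as a standalone function. The two checks that set `needs_fallback = True` are not visible in this hunk and are assumed here from the commit message for #1903; the default values for `compression_ratio_threshold` and `logprob_threshold` are likewise assumptions.

```python
from typing import Optional


def needs_fallback(
    compression_ratio: float,
    avg_logprob: float,
    no_speech_prob: float,
    compression_ratio_threshold: Optional[float] = 2.4,  # assumed default
    logprob_threshold: Optional[float] = -1.0,  # assumed default
    no_speech_threshold: Optional[float] = 0.6,
) -> bool:
    """Decide whether decoding should be retried at a higher temperature."""
    fallback = False
    if (
        compression_ratio_threshold is not None
        and compression_ratio > compression_ratio_threshold
    ):
        fallback = True  # output looks too repetitive
    if logprob_threshold is not None and avg_logprob < logprob_threshold:
        fallback = True  # average log probability is too low
    if (
        no_speech_threshold is not None
        and no_speech_prob > no_speech_threshold
        and logprob_threshold is not None
        and avg_logprob < logprob_threshold
    ):
        fallback = False  # treated as silence, as in the diff
    return fallback


# Repetitive output with acceptable log probability is still retried.
print(needs_fallback(compression_ratio=3.0, avg_logprob=-0.5, no_speech_prob=0.9))
```

Before the change, the last condition lacked the two `logprob_threshold` clauses, so the call above would have been classified as silence and not retried.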
@ -227,9 +235,11 @@ def transcribe(
    all_segments = []
    prompt_reset_since = 0

+    remaining_prompt_length = model.dims.n_text_ctx // 2 - 1
    if initial_prompt is not None:
        initial_prompt_tokens = tokenizer.encode(" " + initial_prompt.strip())
        all_tokens.extend(initial_prompt_tokens)
+        remaining_prompt_length -= len(initial_prompt_tokens)
    else:
        initial_prompt_tokens = []

@ -275,7 +285,13 @@ def transcribe(
            segment_duration = segment_size * HOP_LENGTH / SAMPLE_RATE
            mel_segment = pad_or_trim(mel_segment, N_FRAMES).to(model.device).to(dtype)

+            if carry_initial_prompt:
+                nignored = max(len(initial_prompt_tokens), prompt_reset_since)
+                remaining_prompt = all_tokens[nignored:][-remaining_prompt_length:]
+                decode_options["prompt"] = initial_prompt_tokens + remaining_prompt
+            else:
                decode_options["prompt"] = all_tokens[prompt_reset_since:]
+
            result: DecodingResult = decode_with_fallback(mel_segment)
            tokens = torch.tensor(result.tokens)

@ -529,6 +545,8 @@ def cli():

    parser.add_argument("--suppress_tokens", type=str, default="-1", help="comma-separated list of token ids to suppress during sampling; '-1' will suppress most special characters except common punctuations")
    parser.add_argument("--initial_prompt", type=str, default=None, help="optional text to provide as a prompt for the first window.")
+    parser.add_argument("--carry_initial_prompt", type=str2bool, default=False, help="if True, prepend initial_prompt to every internal decode() call. May reduce the effectiveness of condition_on_previous_text")
+
    parser.add_argument("--condition_on_previous_text", type=str2bool, default=True, help="if True, provide the previous output of the model as a prompt for the next window; disabling may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop")
    parser.add_argument("--fp16", type=str2bool, default=True, help="whether to perform inference in fp16; True by default")
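
Note on `carry_initial_prompt`: the hunks above reserve room for the initial prompt inside a budget of `n_text_ctx // 2 - 1` tokens and fill the rest with the most recent tokens. The sketch below replays that slicing on plain integer lists so the effect is visible without loading a model. `build_prompt` is a hypothetical helper, and the small `n_text_ctx` value is chosen only to keep the example readable.

```python
from typing import List


def build_prompt(
    all_tokens: List[int],
    initial_prompt_tokens: List[int],
    prompt_reset_since: int,
    n_text_ctx: int,
    carry_initial_prompt: bool,
) -> List[int]:
    """Assemble the prompt for one decode() call, following the diff's arithmetic."""
    if not carry_initial_prompt:
        return all_tokens[prompt_reset_since:]
    # Budget left for previous-text tokens once the initial prompt is reserved.
    remaining_prompt_length = n_text_ctx // 2 - 1 - len(initial_prompt_tokens)
    # Skip the initial prompt stored at the front of all_tokens so it is not duplicated.
    nignored = max(len(initial_prompt_tokens), prompt_reset_since)
    remaining_prompt = all_tokens[nignored:][-remaining_prompt_length:]
    return initial_prompt_tokens + remaining_prompt


initial = [100, 101]
history = initial + list(range(10))  # initial prompt followed by ten decoded tokens
print(build_prompt(history, initial, 0, n_text_ctx=16, carry_initial_prompt=True))
```

With a budget of 7 tokens and a 2-token initial prompt, only the 5 most recent history tokens are kept after the prompt. One edge case the sketch inherits from the diff: if the initial prompt consumes the whole budget, `remaining_prompt_length` becomes 0 and the `[-0:]` slice returns the full history rather than an empty list.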

--- a/whisper/triton_ops.py
+++ b/whisper/triton_ops.py
@ -60,7 +60,7 @@ def median_kernel(filter_width: int):
        tl.store(y_ptr + offsets, MIDDLE_ROW_HERE, mask=mask)  # noqa: F821

    kernel = triton.JITFunction(kernel.fn)
-    kernel.src = kernel.src.replace(
+    new_kernel = kernel.src.replace(
        "    LOAD_ALL_ROWS_HERE",
        "\n".join(
            [
@ -69,7 +69,8 @@ def median_kernel(filter_width: int):
            ]
        ),
    )
-    kernel.src = kernel.src.replace(
+
+    new_kernel = new_kernel.replace(
        "    BUBBLESORT_HERE",
        "\n\n".join(
            [
@ -90,7 +91,14 @@ def median_kernel(filter_width: int):
            ]
        ),
    )
-    kernel.src = kernel.src.replace("MIDDLE_ROW_HERE", f"row{filter_width // 2}")
+
+    new_kernel = new_kernel.replace("MIDDLE_ROW_HERE", f"row{filter_width // 2}")
+
+    if hasattr(kernel, "_unsafe_update_src") is True:
+        kernel._unsafe_update_src(new_kernel)
+        kernel.hash = None
+    else:
+        kernel.src = new_kernel

    return kernel
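
Note on the compatibility branch: the change builds the final source in a local variable and then picks an update path by feature detection instead of checking a version number. The sketch below shows the same pattern with two stand-in classes. Both classes are hypothetical and only imitate the attributes the diff relies on (`src`, `hash`, `_unsafe_update_src`); they are not Triton's actual classes.

```python
class OldStyleKernel:
    """Stand-in for a kernel object whose `src` attribute can be assigned directly."""

    def __init__(self, src: str) -> None:
        self.src = src
        self.hash = "stale"


class NewStyleKernel:
    """Stand-in for a kernel object that exposes `_unsafe_update_src` instead."""

    def __init__(self, src: str) -> None:
        self._src = src
        self.hash = "stale"

    @property
    def src(self) -> str:
        return self._src  # read-only in this stand-in

    def _unsafe_update_src(self, new_src: str) -> None:
        self._src = new_src


def update_source(kernel, new_src: str) -> None:
    """Apply the same branch as the diff: prefer the explicit update method."""
    if hasattr(kernel, "_unsafe_update_src"):
        kernel._unsafe_update_src(new_src)
        kernel.hash = None  # the diff also clears the hash on this path
    else:
        kernel.src = new_src


template = "    LOAD_ALL_ROWS_HERE"
for kernel in (OldStyleKernel(template), NewStyleKernel(template)):
    update_source(kernel, template.replace("LOAD_ALL_ROWS_HERE", "row0 = 0"))
    print(type(kernel).__name__, repr(kernel.src), kernel.hash)
```

Both stand-ins end up with the substituted source, and only the new-style one has its `hash` reset.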

--- a/whisper/utils.py
+++ b/whisper/utils.py
@ -209,9 +209,11 @@ class SubtitlesWriter(ResultWriter):

                        yield start, end, "".join(
                            [
+                                (
                                    re.sub(r"^(\s*)(.*)$", r"\1<u>\2</u>", word)
                                    if j == i
                                    else word
+                                )
                                for j, word in enumerate(all_words)
                            ]
                        )
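
Note on the highlighted-word expression: the `re.sub` call wraps the current word in `<u>` tags while keeping any leading whitespace outside the tags. The sketch below isolates that step; `underline_word` is a hypothetical helper and not a function in `utils.py`.

```python
import re
from typing import List


def underline_word(all_words: List[str], i: int) -> str:
    """Join the words, underlining only the word at index i."""
    return "".join(
        # Group 1 is the leading whitespace, group 2 is the rest of the word.
        re.sub(r"^(\s*)(.*)$", r"\1<u>\2</u>", word) if j == i else word
        for j, word in enumerate(all_words)
    )


print(underline_word([" Hello", " world"], 1))
```

For the words `" Hello"` and `" world"` with index 1, the result is ` Hello <u>world</u>`: the space before `world` stays outside the underline.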
--- a/whisper/version.py
+++ b/whisper/version.py
@ -1 +1 @@
-__version__ = "20240930"
+__version__ = "20250625"
Author	SHA1	Message	Date
Jong Wook Kim	c0d2f624c0	Release 20250625	2025-06-25 18:05:47 -07:00
Jong Wook Kim	db7fbc75fe	Release 20250625	2025-06-25 18:03:25 -07:00
Jong Wook Kim	31243bad24	Release 20250625	2025-06-25 18:00:48 -07:00
Dridi Yassin	1f8fc975d3	Fix: Update torch.load to use weights_only=True to prevent security w… (#2451 ) * Fix: Update torch.load to use weights_only=True to prevent security warning * Update __init__.py * Update __init__.py --------- Co-authored-by: Jong Wook Kim <jongwook@openai.com>	2025-06-25 17:54:30 -07:00
Nathan Harmon	679ae1d141	Fix: Ensure DTW cost tensor is on the same device as input tensor (#2561 ) Co-authored-by: Jong Wook Kim <jongwook@openai.com>	2025-06-25 17:42:09 -07:00
Nicholas Nadeau, Ph.D., P.Eng.	f50c4f264e	docs: updated README to specify translation model limitation (#2547 ) Updated README given info from https://github.com/openai/whisper/discussions/2483	2025-06-25 17:03:47 -07:00
ExtReMLapin	86899243e9	Fixed triton kernel update to support latest triton versions (#2588 ) * Update triton kernel using _unsafe_update_src * support old triton versions * refactored changes to update triton kernel only once * Update triton_ops.py --------- Co-authored-by: Jong Wook Kim <jongwook@openai.com> Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>	2025-06-25 17:02:54 -07:00
Learpcs	5dff4db81a	Fix: GitHub display errors for Jupyter notebooks (#2589 ) * Update LibriSpeech.ipynb Update LibriSpeech.ipynb * Update Multilingual_ASR.ipynb	2025-06-25 16:55:15 -07:00
dependabot[bot]	dd985ac4b9	Bump the github-actions group with 3 updates (#2592 ) Bumps the github-actions group with 3 updates: [actions/checkout](https://github.com/actions/checkout), [actions/setup-python](https://github.com/actions/setup-python) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release).
Updates `actions/checkout` from 3 to 4 - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v3...v4) Updates `actions/setup-python` from 4 to 5 - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](https://github.com/actions/setup-python/compare/v4...v5) Updates `softprops/action-gh-release` from 1 to 2 - [Release notes](https://github.com/softprops/action-gh-release/releases) - [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md) - [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: actions/setup-python dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: softprops/action-gh-release dependency-version: '2' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-13 11:22:31 -07:00
Christian Clauss	e1e6aa60ff	Keep GitHub Actions up to date with GitHub's Dependabot (#2486 ) Automates the creation of pull requests like * #2430 * [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot) * [Configuration options for the dependabot.yml file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem)	2025-05-13 11:10:43 -07:00
Christian Clauss	e6a5fc0ff0	pre-commit: Upgrade black v25.1.0 and isort v6.0.0 (#2514 )	2025-05-13 09:43:34 -07:00
Christian Clauss	13907bed90	GitHub Actions: Add Python 3.13 to the testing (#2487 ) * GitHub Actions: Add Python 3.13 to the testing * GitHub Actions: Add Python 3.13 to the testing * numba==0.61.0rc2; python_version=='3.13' * triton>=2; python_version<'3.13' * fail-fast: false * Numba v0.61.0 is released https://github.com/numba/numba/releases * Update pyproject.toml	2025-05-12 21:10:40 -07:00
Jong Wook Kim	517a43ecd1	Update python-publish.yml using `-m build --sdist` instead of `setup.py sdist`	2025-01-04 12:56:16 -08:00
Christian Clauss	dd4d010d2c	PEP 621: Migrate from setup.py to pyproject.toml (#2435 )	2025-01-04 01:38:35 -08:00
Christian Clauss	26a7cacc83	pre-commit autoupdate && pre-commit run --all-files (#2484 ) * pre-commit autoupdate && pre-commit run --all-files * Black formatter needs a current version of Python	2025-01-04 01:02:18 -08:00
Christian Clauss	6c1d8f1ea1	Upgrade GitHub Actions (#2430 )	2025-01-04 00:47:12 -08:00
Purfview	90db0de189	Bugfix: Illogical "Avoid computing higher temperatures on no_speech" (#1903 ) * Bugfix: Illogical "Avoid computing higher temperatures on no_speech" Bugfix for https://github.com/openai/whisper/pull/1279 It's "silence" when decoding has failed due to `compression_ratio_threshold` too, when further down the code it's not "silence" anymore. "Silence" should be only when decoding has failed due to `logprob_threshold`. Like described there: `8bc8860694/whisper/transcribe.py (L421)` And in code there: `8bc8860694/whisper/transcribe.py (L243-L251)` * Fix if "logprob_threshold=None" --------- Co-authored-by: Jong Wook Kim <jongwook@openai.com>	2024-11-30 21:47:01 -08:00
Lowell Vaughn	fc5ded7d90	Updating README and doc strings to reflect that n_mels can now be 128 (#2049 )	2024-11-26 09:37:01 -08:00
f1sh	173ff7dd1d	fix typo data/README.md (#2433 )	2024-11-12 16:35:54 -08:00
BotMaster3000	271445b2f2	Update README.md (#2379 ) Default now uses Turbo instead of Small	2024-11-03 23:00:30 -08:00
kittsil	5979f03701	Add option to carry initial_prompt with the sliding window (#2343 ) * Add option to carry initial_prompt with the sliding window Add an option `carry_initial_prompt = False` to `whisper.transcribe()`. When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`. If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space. * Prevent redundant initial_prompt_tokens * Revert unnecessary .gitignore change --------- Co-authored-by: Kittsil <kittsil@gmail.com> Co-authored-by: Jong Wook Kim <jongwook@openai.com>	2024-10-26 07:17:31 -07:00
Jong Wook Kim	cdb8147962	more pytorch versions in tests (#2408 )	2024-10-25 17:30:02 -07:00