This project is mirrored from https://github.com/openai/whisper.
- Jan 04, 2025
Jong Wook Kim authored
using `-m build --sdist` instead of `setup.py sdist`

Christian Clauss authored

Christian Clauss authored
* pre-commit autoupdate && pre-commit run --all-files
* Black formatter needs a current version of Python

Christian Clauss authored
- Dec 01, 2024
Purfview authored
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"
  Bugfix for https://github.com/openai/whisper/pull/1279. Decoding also counts as "silence" when it has failed due to `compression_ratio_threshold`, yet further down the code it is no longer treated as "silence". "Silence" should apply only when decoding has failed due to `logprob_threshold`, as described in https://github.com/openai/whisper/blob/8bc8860694949db53c42ba47ddc23786c2e02a8b/whisper/transcribe.py#L421 and implemented in https://github.com/openai/whisper/blob/8bc8860694949db53c42ba47ddc23786c2e02a8b/whisper/transcribe.py#L243-L251
* Fix if "logprob_threshold=None"

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
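To make the intent of this bugfix concrete, here is a minimal sketch of the decision it targets; the function name and default thresholds are illustrative, not whisper's exact internals:

```python
# Hedged sketch: a segment should count as "silence" (allowing higher
# decoding temperatures to be skipped) only when the logprob check fails,
# not when the compression-ratio check fails. Names are illustrative.
def is_silence(avg_logprob, no_speech_prob,
               logprob_threshold=-1.0, no_speech_threshold=0.6):
    if logprob_threshold is None:
        # the "logprob_threshold=None" part of the fix: without a logprob
        # check there is no basis for declaring silence
        return False
    return no_speech_prob > no_speech_threshold and avg_logprob < logprob_threshold
```

A decode that failed only the compression-ratio check would thus still retry at higher temperatures instead of being skipped as silence.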
- Nov 26, 2024
Lowell Vaughn authored
- Nov 13, 2024
f1sh authored
- Nov 04, 2024
BotMaster3000 authored
Default now uses Turbo instead of Small
- Oct 26, 2024
kittsil authored
* Add option to carry initial_prompt with the sliding window
  Add an option `carry_initial_prompt = False` to `whisper.transcribe()`. When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`. If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space.
* Prevent redundant initial_prompt_tokens
* Revert unnecessary .gitignore change

Co-authored-by: Kittsil <kittsil@gmail.com>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
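As a rough illustration of the carry-the-initial-prompt behavior described in this commit (the helper name and budget handling are hypothetical, not whisper's actual code):

```python
def build_window_prompt(initial_prompt_tokens, context_tokens, max_prompt_len):
    # Hypothetical sketch: with carry_initial_prompt=True, the initial
    # prompt is prepended to every sliding-window prompt; when the token
    # budget is exceeded, tokens are left-sliced to make room.
    remaining = max_prompt_len - len(initial_prompt_tokens)
    if remaining <= 0:
        # no room for previous context at all; keep the prompt's tail
        return initial_prompt_tokens[-max_prompt_len:]
    return initial_prompt_tokens + context_tokens[-remaining:]
```

Per the commit message, the real option lives on `whisper.transcribe()` as `carry_initial_prompt` and defaults to `False`.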
Jong Wook Kim authored
- Sep 30, 2024
Jong Wook Kim authored
* allowing numpy 2 in tests

Jong Wook Kim authored

Jong Wook Kim authored

Jong Wook Kim authored
* using sdpa if available
* Update model.py
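The "use SDPA if available" pattern can be sketched as follows; this is illustrative, not whisper's exact `model.py` code:

```python
import torch
import torch.nn.functional as F

# Newer PyTorch exposes a fused scaled_dot_product_attention kernel;
# older versions fall back to the explicit softmax(QK^T / sqrt(d)) V
# computation. (Sketch only; whisper's attention layer differs.)
SDPA_AVAILABLE = hasattr(F, "scaled_dot_product_attention")

def attention(q, k, v):
    if SDPA_AVAILABLE:
        return F.scaled_dot_product_attention(q, k, v)
    w = (q @ k.transpose(-2, -1)) * q.shape[-1] ** -0.5
    return w.softmax(dim=-1) @ v
```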
- Sep 27, 2024
- Sep 10, 2024
Jong Wook Kim authored
* pinning numpy<2 in tests
* pip install together

Jianan Xing authored
* Relax triton requirements for compatibility with pytorch 2.4 and newer
  Similar to https://github.com/openai/whisper/pull/1802, but now when pytorch upgrades to 2.4 it requires triton==3.0.0. I am not sure if it makes sense to remove the upper-bound version constraints.
* Update requirements.txt
- Dec 18, 2023
ryanheise authored
* Add clip_timestamps option
* Add hallucination_silence_threshold option
* Fix typing for python < 3.9

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
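A hedged sketch of how a clip-timestamps string could be interpreted, assuming the common "start,end,start,end,..." seconds format with a missing final end defaulting to the end of the file (the helper is illustrative, not whisper's actual parser):

```python
def parse_clip_timestamps(spec, file_duration):
    # Split "start,end,start,end,..." (seconds) into (start, end) pairs.
    # An empty spec means "the whole file"; an odd number of values means
    # the last clip runs to the end of the file. Assumed semantics.
    if not spec:
        return [(0.0, file_duration)]
    ts = [float(t) for t in spec.split(",")]
    if len(ts) % 2 == 1:
        ts.append(file_duration)
    return list(zip(ts[0::2], ts[1::2]))
```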
- Dec 11, 2023
Bob Lin authored
- Nov 17, 2023
- Nov 13, 2023
Eugene Indenbom authored
- Nov 06, 2023
Jong Wook Kim authored
* mel_filters() loads 128 mel bins
* can load 100-language models
* large-v3 checkpoint and evals
* add mandarin alias
* remove unused path
* flake8 fix
* formatting fix
Jong Wook Kim authored

Philippe Hebert authored
* docs: defines relative speed in README
* combined paragraphs

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Mohamad Zamini authored
* Update audio.py
  The `mel_filters` function uses `np.load` to load a precomputed mel filterbank matrix. That function is not thread-safe, so if it is called from multiple threads at the same time it may corrupt the data. To fix this, you can use `torch.load` instead, which is thread-safe.
* Update audio.py: updated the docstring
* allow_pickle=False
* newline

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
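Per the bullets above, the change that landed was `allow_pickle=False`. A sketch of the hardened loading pattern (the file layout and key name are assumptions for illustration):

```python
import numpy as np

def load_mel_filterbank(path, n_mels):
    # allow_pickle=False ensures np.load reads only plain arrays from the
    # .npz archive and never unpickles arbitrary objects. The "mel_{n}"
    # key naming is an assumption, not whisper's guaranteed layout.
    with np.load(path, allow_pickle=False) as archive:
        return archive[f"mel_{n_mels}"]
```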
Marco Zucconelli authored
* handling transcribe() exceptions
* printing stacktrace

Co-authored-by: invalid <invalid@email.com>
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
amosal authored
* ADD parser for new argument --max_words_count
* ADD max_words_count in words_options; ADD warning for max_line_width compatibility
* ADD logic for max_words_count
* rename to max_words_per_line
* make them kwargs
* allow specifying file path by --model
* black formatting

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
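The effect of the `max_words_per_line` option can be pictured with a tiny sketch (the helper name is hypothetical; the real logic lives in whisper's subtitle writers):

```python
def split_words_per_line(words, max_words_per_line):
    # Chunk a segment's timed words into subtitle lines of at most N
    # words each. Illustrative only.
    return [words[i:i + max_words_per_line]
            for i in range(0, len(words), max_words_per_line)]
```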
- Oct 10, 2023
Jordi Mas authored

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
- Sep 19, 2023
- Sep 18, 2023
Jong Wook Kim authored

Arthur Kim authored
* Add .pre-commit-config.yaml
* flake8 E741

Co-authored-by: arthur <arthur@rtzr.ai>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>

sqhao authored

Signed-off-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
Co-authored-by: haoshengqiang <haoshengqiang@xiaohongshu.com>

Nino Risteski authored
fixed a few typos
- Aug 07, 2023
taylorchu authored
* word timing tweaks
* comment on eot
* clearer comments
- Jul 06, 2023
WangChou Lu authored
* avoid rearranging all kv_caches
* avoid calculating the same kv_cache from cross attn
* Update decoding.py
* linter fix

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
- Jun 29, 2023
ryanheise authored
* Improve timestamp heuristics.
* Track pauses with last_speech_timestamp
- May 05, 2023
Valentin Berkes authored
`prompt_reset_since` is set before `all_tokens` is extended, hence it does not have the expected effect.
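The ordering bug this commit fixes can be pictured with a tiny sketch (hypothetical names; whisper tracks `prompt_reset_since` inside `transcribe()`):

```python
def extend_and_reset(all_tokens, new_tokens):
    # Sketch of the corrected ordering: record the reset point *after*
    # extending all_tokens, so the next window's prompt really starts
    # from the freshly added tokens rather than stale context.
    all_tokens = all_tokens + new_tokens
    prompt_reset_since = len(all_tokens)
    return all_tokens, prompt_reset_since
```

Computing `prompt_reset_since` before the extension, as the buggy code did, would leave the just-added tokens inside the "reset" region.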