Commit Graph

2773 Commits

Author SHA1 Message Date
游雁
4402e95b0f v1.2.7 2025-08-15 15:22:18 +08:00
游雁
f5051c55cd trust_remote_code 2025-08-15 15:10:37 +08:00
majic31
5115a066c9
fix #2587: Resolve VAD multithreading issue (#2613)
* Fix crash in ASR tasks when lm is set to none in #2237

* fix #2587: Resolve VAD multithreading issue

* Update funasr/models/fsmn_vad_streaming/model.py

ok

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-08-14 16:08:44 +08:00
ming030890
b3fb4c0acd
Allow one to set a custom progress callback (#2609)
* Allow one to set a custom progress callback

so that they can show their own progress bar

* Uncomment an existing test

* restore indentation

---------

Co-authored-by: Tony Mak <tony@Tonys-MacBook-Air-1802.local>
2025-08-05 17:48:10 +08:00
ming030890
a750595594
Fix a few issues found during fine-tuning (#2582)
* Fix wandb log

* Fix validation loss not being logged

batch_idx got reset for each epoch.
use the global step counter instead

* LR should only be updated per step, not per step + per epoch

* add early stopping

* Fix bf16 handling

scaler is only needed for fp16

* more logs

---------

Co-authored-by: Tony Mak <tony@Tonys-MacBook-Air-1800.local>
2025-07-04 14:25:54 +08:00
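The logging fix in the commit above (a750595594) rests on the difference between a per-epoch batch index and a counter that is never reset. A minimal sketch of that difference, assuming a plain nested training loop; `log_steps` and `global_step` are hypothetical names for illustration, not FunASR's trainer code:

```python
def log_steps(num_epochs, batches_per_epoch):
    """Return (batch_idx, global_step) pairs as a nested training loop sees them."""
    seen = []
    global_step = 0  # hypothetical counter, never reset between epochs
    for _epoch in range(num_epochs):
        for batch_idx in range(batches_per_epoch):  # restarts at 0 every epoch
            global_step += 1
            seen.append((batch_idx, global_step))
    return seen


pairs = log_steps(num_epochs=2, batches_per_epoch=3)
batch_indices = [b for b, _ in pairs]
global_steps = [g for _, g in pairs]
print(batch_indices)  # [0, 1, 2, 0, 1, 2] -> repeats, so logged points collide
print(global_steps)   # [1, 2, 3, 4, 5, 6] -> strictly increasing x-axis
```

Keying log entries on the repeating index makes later epochs land on the same steps as earlier ones, which is one way a metric can appear to go missing in a step-indexed logger.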
kmn1024
443bc09c11
Bugfix: Only allow rank==0 to clean up old checkpoints (#2558)
Fixes bug: https://github.com/modelscope/FunASR/issues/2557
2025-06-25 16:34:30 +08:00
yuGAN6
3445cd9652
sensevoice2jsonl.py punctuation matching fix (#2533)
* fix sensevoice2jsonl.py punctuation check

* fix sensevoice2jsonl.py punc check
2025-05-28 10:33:26 +08:00
chengligen
8b0fb74bde
feat: add 'words' key aligned with timestamps in sensevoice model output (#2531) 2025-05-26 14:11:33 +08:00
王梦迪
561bdbdfc0
Speed up seaco_paraformer inference by caching seg_dict (#2520)
Co-authored-by: wangmengdi06 <wangmengdi06@58.com>
2025-05-22 11:27:01 +08:00
xmx0632
e7237d8cb4
add mac m1 mps support (#2477) 2025-04-14 13:40:12 +08:00
AldarisX
d43d0853dc
add intel xpu support (#2468) 2025-04-07 21:20:31 +08:00
yijinsheng
9afa40520f
Local model loading (#2453) 2025-04-07 00:29:17 +08:00
Isuxiz Slidder
3df109adfc
Update model.py to fix "IndexError: index 1 is out of bounds for dimension 1 with size 0" (#2454)
* Update model.py

Avoid exception of "IndexError: index 1 is out of bounds for dimension 1 with size 0"

* Update model.py

Add return word in timestamps

* Revert "Update model.py"

This reverts commit bc736df302.
2025-03-31 17:51:52 +08:00
天地
e24dbdc496
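Reading from the file seems more appropriate here, since the code above has already checked that the file exists and is readable; if the input is itself text, the logic below will handle it (#2452)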
Co-authored-by: tiandiweizun <qq1274949542@163.com>
2025-03-26 13:44:41 +08:00
passerbya
5ee2f382b3
FIX 'NoneType' object has no attribute 'isalpha' (#2440)
Traceback (most recent call last):
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/data2/workspace/egs_vocal_extractor/data/speech_det.py", line 156, in process_audio_task
    res = model.generate(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 306, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 464, in inference_with_vad
    results = self.inference(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 345, in inference
    res = model.inference(**batch, **kwargs)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/models/sense_voice/model.py", line 950, in inference
    timestamp = self.post(timestamp)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/models/sense_voice/model.py", line 973, in post
    elif prev_word.isalpha() and prev_word.isascii() and word.isalpha() and word.isascii():
AttributeError: 'NoneType' object has no attribute 'isalpha'
2025-03-20 23:01:05 +08:00
天地
6e69d784e4
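1. Bug fix: list(mean) and list(var): because mean and var are numpy arrays, they were written to the file in the wrong format; judging from the comment above, the intent was most likely list(mean.tolist()), and the outer list is actually unnecessary (#2437)
2. Remove unnecessary code: list(numpy_array.tolist()) --> numpy_array.tolist()
3. Performance: replace is unnecessary and slow, at O(nm) where n is the length of the source string and m is the length of the string being replaced; although m is 1 here, the "[]" produced when converting a list to a string appears only at the start and end, so direct concatenation is enough.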

Co-authored-by: tiandiweizun <qq1274949542@163.com>
2025-03-19 23:10:13 +08:00
Han Zhang
3c349ac053
fix: use converted token_ids for alignment for sensevoice model with timestamp output (#2429)
* fix: use converted token_ids for alignment

BPE doesn't guarantee that converted ids (subwords) are reversible, which means `tokens` converted back are not always the same as `token_int`. An easy fix is to use the converted ids directly for alignment. Since they come from the same text, it shouldn't matter.

* fix: handle empty string

Indexing an empty string raises an exception, and there was no empty check here.
2025-03-18 11:45:37 +08:00
游雁
93c701bab6 v1.2.6 2025-03-11 14:26:35 +08:00
Shi Xian
700cb827f5
Revert "# Enhance timestamp support for speaker diarization" 2025-03-11 13:54:23 +08:00
hohaiuhsx
6fe10a8dbf
Fix an exception when using the SenseVoice model on long audio (with both vad and output_timestamp enabled) (#2413) 2025-03-10 23:16:22 +08:00
游雁
9c67d9b969 v1.2.5 2025-03-07 23:41:54 +08:00
msgk
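a8591060d3 fix(spk): fix reordering after speaker embedding clustering
- Add a check for timestamp support
- Initialize the punc_res variable to handle the different cases
- Set punc_res according to the model setup, covering an internal punctuation model, an external punctuation model, and the timestamp-only case
- Fix error handling when the punctuation model is missing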
2025-02-14 14:16:51 +08:00
游雁
53ac0cb401 v1.2.4 2025-02-13 14:16:05 +08:00
游雁
604ae30fdb oom fix 2025-02-13 14:06:03 +08:00
游雁
001a66bbfe oom fix 2025-02-11 10:08:19 +08:00
BienBoy
6ebf6e48eb
fix: resolve CPU runtime error introduced by previous commit (c1e365f) (#2375)
Fixed a bug that caused a runtime error when running the model on CPU, which was introduced in commit c1e365fea0. The error was related to incorrect handling of device placement.
2025-02-05 17:47:20 +08:00
BienBoy
c1e365fea0
fix: resolve unexpected 'out of memory' issue in multi-GPU setup (#2373)
Fixed a bug where calling torch.cuda.empty_cache() caused extra memory usage on 'cuda:0', leading to unexpected 'out of memory' errors in multi-GPU environments.

Reference:
- https://github.com/pytorch/pytorch/issues/25752
- https://github.com/pytorch/pytorch/issues/144025
2025-02-01 23:29:34 +08:00
游雁
c4e7014492 v1.2.3 2025-01-24 16:59:23 +08:00
游雁
23c6d67288 emotion2vec 2025-01-16 11:25:36 +08:00
takipipo
3530688e0a
Make Emotion2vec support onnx (#2359)
* Make emotion2vec exportable to onnx

* Make export_meta of emotion2vec consistent with other models

* Include layer norm in the exported onnx model
2025-01-16 10:33:23 +08:00
游雁
d4f13c2e44 step_or_epoch bugfix 2025-01-10 10:16:11 +08:00
游雁
e6fe602db3 step_or_epoch bugfix 2025-01-10 10:14:30 +08:00
maliubiao
172a3152b4
Allow model.generate to accept bytes IO so that no file has to be written, saving I/O time (#2343) 2024-12-29 22:33:22 +08:00
游雁
a3a1c55c4c v1.2.2 2024-12-25 17:27:10 +08:00
zhifu gao
3f8294b9d7
Revert "shfit to shift (#2266)" (#2336)
This reverts commit 1367973f98.
2024-12-25 17:16:11 +08:00
Zhanzhao (Deo) Liang
8c7b7e5feb
fix export_meta import of sense voice (#2334) 2024-12-25 16:40:29 +08:00
Rin Arakaki
1367973f98
shfit to shift (#2266) 2024-12-24 17:51:31 +08:00
Zhiming Wang
d2cd95bd67
utils.install_model_requirements: support installing with uv (#2329)
When using the uv[1] package manager, pip commands need to be proxied through
uv's pip compatible interface[2]. Calling pip directly causes a
FileNotFoundError.

[1] https://docs.astral.sh/uv/
[2] https://docs.astral.sh/uv/pip/packages/
2024-12-24 09:59:37 +08:00
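The uv commit above (d2cd95bd67) describes routing pip commands through uv's pip-compatible interface. A rough sketch of that idea follows; it is not the actual `utils.install_model_requirements` code. The helper name `build_install_command` and the "uv is on PATH" detection rule are assumptions made for illustration, and the real commit may detect uv differently:

```python
import shutil
import sys


def build_install_command(package, use_uv=None):
    """Build an install command, going through `uv pip` when uv is in use."""
    if use_uv is None:
        # Assumption: "uv is on PATH" is taken as the signal that uv manages the env.
        use_uv = shutil.which("uv") is not None
    if use_uv:
        # uv exposes a pip-compatible subcommand: `uv pip install <package>`.
        return ["uv", "pip", "install", package]
    # Fall back to the current interpreter's pip module.
    return [sys.executable, "-m", "pip", "install", package]


uv_cmd = build_install_command("some-package", use_uv=True)
pip_cmd = build_install_command("some-package", use_uv=False)
print(uv_cmd)
print(pip_cmd)
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building the command separately keeps the selection logic testable without installing anything.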
游雁
d32e112894 bug fix 2024-12-23 21:24:55 +08:00
游雁
1e5ef6ed9a bug fix 2024-12-23 19:06:50 +08:00
zhong zhuang
fcb2102a60
Fix seaco onnx export bug (#2325) 2024-12-21 17:14:35 +08:00
Kun Zou
b5ad7c81be
Support eparaformer model on aishell1 recipe (#2327) 2024-12-21 17:13:46 +08:00
游雁
fdafd3f6bc emotion2vec 2024-12-17 11:15:53 +08:00
游雁
2139ef696b v1.2.0 2024-12-12 11:37:59 +08:00
游雁
5f48457cf1 v1.1.18 2024-12-12 11:37:23 +08:00
游雁
41785b1daf v1.1.18 2024-12-12 11:35:27 +08:00
游雁
bb0017a686 bugfix 2024-12-12 11:35:06 +08:00
游雁
0f3d2d1266 v1.1.17 2024-12-11 14:21:57 +08:00
游雁
92586a4a90 fix bytes 2024-12-10 17:43:58 +08:00
shixian
026b8e3fdc update sensevoice small with timestamp 2024-12-05 19:29:19 +08:00