mirror of https://github.com/modelscope/FunASR (synced 2025-09-15 14:48:36 +08:00)
delete speech_fsmn_vad_zh-cn-16k-common-pytorch
This commit is contained in: parent 91027ddab4, commit ff8fdd4acf
@@ -1,24 +0,0 @@

# ModelScope Model

## How to finetune and infer using a pretrained ModelScope Model

### Inference
Or you can use the finetuned model for inference directly.

- Setting parameters in `infer.py`
- <strong>audio_in:</strong> the input audio; wav files, URLs, bytes, and parsed audio formats are supported.
- <strong>output_dir:</strong> the directory where results are written; it must be set when the input is a `wav.scp` list.
- Then you can run the pipeline to infer with:

```shell
python infer.py
```
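
For batch inference, `audio_in` can also point to a `wav.scp` list, in which case `output_dir` must be set so the segment results are written to disk. The following is a minimal sketch of that configuration, reusing the pipeline setup from `infer.py`; the `wav.scp` path and output directory are hypothetical examples.

```python
# Minimal sketch: batch VAD inference driven by a wav.scp list.
# The wav.scp path and output directory are hypothetical examples.
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipline = pipeline(
    task=Tasks.auto_speech_recognition,
    model="damo/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    output_dir="./vad_results",  # required because the input is a wav.scp list
    batch_size=1,
)

# wav.scp lists one "<utterance_id> <path/to/audio.wav>" entry per line.
segments_result = inference_pipline(audio_in="./data/wav.scp")
print(segments_result)
```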
Modify the inference-related parameters in `vad.yaml` (a sketch of overriding them programmatically follows the list below):

- max_end_silence_time: the trailing-silence duration used to decide that a sentence has ended; the valid range is 500 ms to 6000 ms, and the default is 800 ms.
- speech_noise_thres: the balance between the speech and silence scores; the valid range is (-1, 1).
  - The closer the value is to -1, the more likely noise is judged as speech.
  - The closer the value is to 1, the more likely speech is judged as noise.
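
As a convenience, the same two values can be overridden programmatically before inference. The sketch below assumes `vad.yaml` lives at a hypothetical path inside the downloaded model directory and that the two keys sit at the top level of the file; it simply loads the YAML with PyYAML, changes the values, and writes it back.

```python
# Minimal sketch: override VAD parameters in vad.yaml before running inference.
# The config path is a hypothetical example, and the two keys are assumed to
# sit at the top level of the file.
import yaml  # PyYAML

VAD_CONFIG = "./speech_fsmn_vad_zh-cn-16k-common-pytorch/vad.yaml"

with open(VAD_CONFIG, "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

# Wait 1.5 s of trailing silence before ending a sentence (default is 800 ms).
config["max_end_silence_time"] = 1500
# Nudge the speech/silence balance so ambiguous frames lean toward speech.
config["speech_noise_thres"] = -0.1

with open(VAD_CONFIG, "w", encoding="utf-8") as f:
    yaml.safe_dump(config, f)
```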
@@ -1,15 +0,0 @@
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks


if __name__ == '__main__':
    # Input audio: a wav file, URL, bytes, or parsed audio; a sample URL is used here.
    audio_in = 'https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav'
    # output_dir only needs to be set when audio_in is a wav.scp list.
    output_dir = None
    # Build the ModelScope pipeline around the pretrained FSMN VAD model.
    inference_pipline = pipeline(
        task=Tasks.auto_speech_recognition,
        model="damo/speech_fsmn_vad_zh-cn-16k-common-pytorch",
        model_revision=None,
        output_dir=output_dir,
        batch_size=1,
    )
    # Run VAD and print the detected speech segments.
    segments_result = inference_pipline(audio_in=audio_in)
    print(segments_result)