mirror of
https://github.com/modelscope/FunASR
synced 2025-09-15 14:48:36 +08:00
docs
This commit is contained in:
parent
7ca311f847
commit
4e0aae556b
13
README.md
@@ -97,19 +97,18 @@ This project is licensed under the [The MIT License](https://opensource.org/lice

## Citations

```bibtex
@inproceedings{gao2020universal,
  title={Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model},
  author={Gao, Zhifu and Zhang, Shiliang and Lei, Ming and McLoughlin, Ian},
  booktitle={arXiv preprint arXiv:2010.14099},
  year={2020}
}

@inproceedings{gao2022paraformer,
  title={Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition},
  author={Gao, Zhifu and Zhang, Shiliang and McLoughlin, Ian and Yan, Zhijie},
  booktitle={INTERSPEECH},
  year={2022}
}

@inproceedings{Shi2023AchievingTP,
  title={Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model},
  author={Xian Shi and Yanni Chen and Shiliang Zhang and Zhijie Yan},
@@ -60,6 +60,6 @@ sudo docker exec -it funasr bash

```shell
exit
sudo docker ps
sudo docker stop funasr
```

@@ -21,9 +21,10 @@ FunASR hopes to build a bridge between academic research and industrial applicat

   :caption: Recipe

   ./recipe/asr_recipe.md
   ./recipe/punc_recipe.md
   ./recipe/vad_recipe.md
   ./recipe/sv_recipe.md
   ./recipe/sd_recipe.md

.. toctree::
   :maxdepth: 1

@@ -50,6 +51,12 @@ FunASR hopes to build a bridge between academic research and industrial applicat

   ./modescope_pipeline/sv_pipeline.md
   ./modescope_pipeline/sd_pipeline.md

.. toctree::
   :maxdepth: 1
   :caption: Huggingface pipeline

.. toctree::
   :maxdepth: 1
   :caption: Runtime

@@ -53,7 +53,7 @@ inference_pipeline = pipeline(

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
```

The decoding modes `fast` and `normal` are fake streaming, which can be used to evaluate recognition accuracy.
For the full demo code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/151)

#### [RNN-T-online model]()

@@ -45,7 +45,7 @@ Full code of demo, please ref to [demo](https://github.com/alibaba-damo-academy/

#### API-reference
##### Define pipeline
- `task`: `Tasks.voice_activity_detection`
- `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path in local disk
- `ngpu`: `1` (Default), decoding on GPU; if `ngpu=0`, decoding on CPU
- `ncpu`: `1` (Default), sets the number of threads used for intra-op parallelism on CPU
@@ -67,7 +67,7 @@ Full code of demo, please ref to [demo](https://github.com/alibaba-damo-academy/

- `output_dir`: `None` (Default), the output path of results if set

### Inference with multi-thread CPUs or multi GPUs
FunASR also offers the recipe [infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/vad/TEMPLATE/infer.sh) to decode with multi-thread CPUs or multiple GPUs.

- Setting parameters in `infer.sh`
  - `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path in local disk
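The recipe shards its input wav list across parallel decoding jobs and merges the results. The same idea can be sketched in plain Python with a thread pool; the `transcribe` stub below merely stands in for a real inference call and is not FunASR code:

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe(wav_path):
    # Hypothetical stub standing in for one decoding job's real
    # model call; returns a fake key/text result for illustration.
    return {"key": wav_path, "text": "<transcript of %s>" % wav_path}

def batch_infer(wav_list, num_workers=4):
    # Fan the wav list out over num_workers parallel workers,
    # mirroring how infer.sh splits its list across jobs;
    # map() returns results in input order.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(transcribe, wav_list))

if __name__ == "__main__":
    for r in batch_infer(["a.wav", "b.wav", "c.wav"], num_workers=2):
        print(r["key"])
```

For GPU sharding, each worker would additionally pin itself to one device before decoding its shard.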