docs and GPU memory release

This commit is contained in:
游雁 2023-07-14 19:03:15 +08:00
parent fb180cc86c
commit 579afd6ac8
5 changed files with 16 additions and 9 deletions


@@ -14,13 +14,13 @@
 [**News**](https://github.com/alibaba-damo-academy/FunASR#whats-new)
 | [**Highlights**](#highlights)
 | [**Installation**](#installation)
-| [**Usage**](#usage)
-| [**Papers**](https://github.com/alibaba-damo-academy/FunASR#citations)
-| [**Runtime**](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime)
-| [**Model Zoo**](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md)
+| [**Quick Start**](#quick-start)
+| [**Runtime**](./funasr/runtime/readme.md)
+| [**Model Zoo**](./docs/model_zoo/modelscope_models.md)
 | [**Contact**](#contact)
-| [**M2MET2.0 Challenge**](https://github.com/alibaba-damo-academy/FunASR#multi-channel-multi-party-meeting-transcription-20-m2met20-challenge)
+<a name="whats-new"></a>
 ## What's new:
 ### FunASR runtime-SDK
@@ -36,11 +36,13 @@ We are pleased to announce that the M2MeT2.0 challenge has been accepted by the
 For the release notes, please ref to [news](https://github.com/alibaba-damo-academy/FunASR/releases)
+<a name="highlights"></a>
 ## Highlights
 - FunASR is a fundamental speech recognition toolkit that offers a variety of features, including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker diarization and multi-talker ASR.
 - We have released a vast collection of academic and industrial pretrained models on the [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition), which can be accessed through our [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md). The representative [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) model has achieved SOTA performance in many speech recognition tasks.
 - FunASR offers a user-friendly pipeline for fine-tuning pretrained models from the [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition). Additionally, the optimized dataloader in FunASR enables faster training speeds for large-scale datasets. This feature enhances the efficiency of the speech recognition process for researchers and practitioners.
+<a name="Installation"></a>
 ## Installation
 Install from pip
@@ -70,7 +72,8 @@ pip3 install -U modelscope
 For more details, please ref to [installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)
-## Usage
+<a name="quick-start"></a>
+## Quick Start
 You could use FunASR by:
@@ -120,6 +123,8 @@ python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass --chunk
 #python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass --chunk_size "8,8,4" --audio_in "./data/wav.scp" --output_dir "./results"
 ```
 More examples could be found in [docs](https://alibaba-damo-academy.github.io/FunASR/en/runtime/websocket_python.html#id2)
+<a name="contact"></a>
 ## Contact
 If you have any questions about FunASR, please contact us by


@@ -201,7 +201,7 @@ cd egs/aishell/paraformer
 ## License
-This project is released under [The MIT License](https://opensource.org/licenses/MIT). For the industrial model license agreement, please refer to ([click here](./MODEL_LICENSE)
+This project is released under [The MIT License](https://opensource.org/licenses/MIT) For the industrial model license agreement, please refer to ([click here](./MODEL_LICENSE)
 ## Stargazers over time


@@ -439,6 +439,7 @@ def inference_paraformer(
         logging.info(rtf_avg)
         if writer is not None:
             ibest_writer["rtf"]["rtf_avf"] = rtf_avg
+        torch.cuda.empty_cache()
         return asr_result_list
     return _forward
@@ -730,6 +731,7 @@ def inference_paraformer_vad_punc(
             ibest_writer["time_stamp"][key] = "{}".format(time_stamp_postprocessed)
         logging.info("decoding, utt: {}, predictions: {}".format(key, text_postprocessed_punc))
+        torch.cuda.empty_cache()
         return asr_result_list
     return _forward


@@ -123,7 +123,7 @@ def inference_vad(
             vad_results.append(item)
             if writer is not None:
                 ibest_writer["text"][keys[i]] = "{}".format(results[i])
+        torch.cuda.empty_cache()
         return vad_results
     return _forward


@@ -1 +1 @@
-0.6.9
+0.7.0