From 91630e73316c1e04cad46a8bdffa59765822cc45 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=8C=97=E5=BF=B5?=
Date: Mon, 16 Oct 2023 16:23:32 +0800
Subject: [PATCH] update modelscope model zoo

---
 docs/model_zoo/modelscope_models.md    | 3 ++-
 docs/model_zoo/modelscope_models_zh.md | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/model_zoo/modelscope_models.md b/docs/model_zoo/modelscope_models.md
index 1e15381b6..46b1313fb 100644
--- a/docs/model_zoo/modelscope_models.md
+++ b/docs/model_zoo/modelscope_models.md
@@ -17,7 +17,8 @@ Here we provided several pretrained models on different datasets. The details of
 | Model Name | Language | Training Data | Vocab Size | Parameter | Offline/Online | Notes |
 |:--------------------------------------------------------------------------------------------------------------------------------------------------:|:--------:|:--------------------------------:|:----------:|:---------:|:--------------:|:--------------------------------------------------------------------------------------------------------------------------------|
 | [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | CN & EN | Alibaba Speech Data (60000hours) | 8404 | 220M | Offline | Duration of input wav <= 20s |
-| [Paraformer-large-long](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | CN & EN | Alibaba Speech Data (60000hours) | 8404 | 220M | Offline | Which would deal with arbitrary length input wav |
+| [Paraformer-large-long](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | CN & EN | Alibaba Speech Data (60000hours) | 8404 | 220M | Offline | Which would deal with arbitrary length input wav |
+| [Paraformer-large-en-long](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) | EN | Alibaba Speech Data (50000hours) | 10020 | 220M | Offline | Which would deal with arbitrary length input wav |
 | [Paraformer-large-Spk](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary) | CN & EN | Alibaba Speech Data (60000hours) | 8404 | 220M | Offline | Supporting speaker diarizatioin for ASR results based on paraformer-large-long |
 | [Paraformer-large-contextual](https://www.modelscope.cn/models/damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/summary) | CN & EN | Alibaba Speech Data (60000hours) | 8404 | 220M | Offline | Which supports the hotword customization based on the incentive enhancement, and improves the recall and precision of hotwords. |
 | [Paraformer](https://modelscope.cn/models/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/summary) | CN & EN | Alibaba Speech Data (50000hours) | 8358 | 68M | Offline | Duration of input wav <= 20s |
diff --git a/docs/model_zoo/modelscope_models_zh.md b/docs/model_zoo/modelscope_models_zh.md
index c21ae97b8..88fa23e76 100644
--- a/docs/model_zoo/modelscope_models_zh.md
+++ b/docs/model_zoo/modelscope_models_zh.md
@@ -17,7 +17,8 @@
 | 模型名字 | 语言 | 训练数据 | 词典大小 | 参数量 | 非实时/实时 | 备注 |
 |:--------------------------------------------------------------------------------------------------------------------------------------------------:|:--------:|:---------------------:|:-----------------:|:----:|:-------:|:---------------------------|
 | [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | 中文和英文 | 阿里巴巴语音数据(60000小时) | 8404 | 220M | 非实时 | 输入wav文件持续时间不超过20秒 |
-| [Paraformer-large长音频版本](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | 中文和英文 | 阿里巴巴语音数据(60000小时) | 8404 | 220M | 非实时 | 能够处理任意长度的输入wav文件 |
+| [Paraformer-large长音频版本](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | 中文和英文 | 阿里巴巴语音数据(60000小时) | 8404 | 220M | 非实时 | 能够处理任意长度的输入wav文件 |
+| [Paraformer-large-en长音频版本](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) | 英文 | 阿里巴巴语音数据(50000小时) | 10020 | 220M | 非实时 | 能够处理任意长度的输入wav文件 |
 | [Paraformer-large-Spk](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary) | 中文和英文 | 阿里巴巴语音数据(60000小时) | 8404 | 220M | 非实时 | 在长音频功能的基础上添加说话人识别功能 |
 | [Paraformer-large热词](https://www.modelscope.cn/models/damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/summary) | 中文和英文 | 阿里巴巴语音数据(60000小时) | 8404 | 220M | 非实时 | 基于激励增强的热词定制支持,可以提高热词的召回率和准确率,输入wav文件持续时间不超过20秒 |
 | [Paraformer](https://modelscope.cn/models/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/summary) | 中文和英文 | 阿里巴巴语音数据(50000小时) | 8358 | 68M | 离线 | 输入wav文件持续时间不超过20秒 |