funasr1.0.4

This commit is contained in:
游雁 2024-01-30 14:47:26 +08:00
parent c47ad73e46
commit f1c1cb0773
5 changed files with 37 additions and 75 deletions

View File

@ -3,11 +3,10 @@
([简体中文](./README_zh.md)|English)
# FunASR: A Fundamental End-to-End Speech Recognition Toolkit
<p align="left">
<a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
</p>
[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)
<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model, researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun
@ -28,6 +27,7 @@
<a name="whats-new"></a>
## What's new:
- 2024/01/30funasr-1.0 has been released ([docs](https://github.com/alibaba-damo-academy/FunASR/discussions/1319))
- 2024/01/25: Offline File Transcription Service 4.2, Offline File Transcription Service of English 1.3 releasedoptimized the VAD (Voice Activity Detection) data processing method, significantly reducing peak memory usage, memory leak optimization; Real-time Transcription Service 1.7 releasedoptimizatized the client-side([docs](runtime/readme.md))
- 2024/01/09: The Funasr SDK for Windows version 2.0 has been released, featuring support for The offline file transcription service (CPU) of Mandarin 4.1, The offline file transcription service (CPU) of English 1.2, The real-time transcription service (CPU) of Mandarin 1.6. For more details, please refer to the official documentation or release notes([FunASR-Runtime-Windows](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
- 2024/01/03: File Transcription Service 4.0 released, Added support for 8k models, optimized timestamp mismatch issues and added sentence-level timestamps, improved the effectiveness of English word FST hotwords, supported automated configuration of thread parameters, and fixed known crash issues as well as memory leak problems, refer to ([docs](runtime/readme.md#file-transcription-service-mandarin-cpu)).
@ -48,7 +48,19 @@
<a name="Installation"></a>
## Installation
Please ref to [installation docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)
```shell
pip3 install -U funasr
```
Or install from source code
``` sh
git clone https://github.com/alibaba/FunASR.git && cd FunASR
pip3 install -e ./
```
Install modelscope for the pretrained models (Optional)
```shell
pip3 install -U modelscope
```
## Model Zoo
FunASR has open-sourced a large number of pre-trained models on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models, for more models please refer to the [Model Zoo]().

View File

@ -3,11 +3,9 @@
(简体中文|[English](./README.md))
# FunASR: A Fundamental End-to-End Speech Recognition Toolkit
<p align="left">
<a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
</p>
[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)
FunASR希望在语音识别的学术研究和工业应用之间架起一座桥梁。通过发布工业级语音识别模型的训练和微调研究人员和开发人员可以更方便地进行语音识别模型的研究和生产并推动语音识别生态的发展。让语音识别更有趣
@ -31,6 +29,7 @@ FunASR希望在语音识别的学术研究和工业应用之间架起一座桥
<a name="最新动态"></a>
## 最新动态
- 2024/01/30funasr-1.0发布,更新说明[文档](https://github.com/alibaba-damo-academy/FunASR/discussions/1319)
- 2024/01/25: 中文离线文件转写服务 4.2、英文离线文件转写服务 1.3优化vad数据处理方式大幅降低峰值内存占用内存泄漏优化中文实时语音听写服务 1.7 发布,客户端优化;详细信息参阅([部署文档](runtime/readme_cn.md))
- 2024/01/09: funasr社区软件包windows 2.0版本发布支持软件包中文离线文件转写4.1、英文离线文件转写1.2、中文实时听写服务1.6的最新功能,详细信息参阅([FunASR社区软件包windows版本](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
- 2024/01/03: 中文离线文件转写服务 4.0 发布新增支持8k模型、优化时间戳不匹配问题及增加句子级别时间戳、优化英文单词fst热词效果、支持自动化配置线程参数同时修复已知的crash问题及内存泄漏问题详细信息参阅([部署文档](runtime/readme_cn.md#中文离线文件转写服务cpu版本))
@ -49,7 +48,20 @@ FunASR希望在语音识别的学术研究和工业应用之间架起一座桥
<a name="安装教程"></a>
## 安装教程
FunASR安装教程请阅读[Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)
```shell
pip3 install -U funasr
```
或者从源代码安装
``` sh
git clone https://github.com/alibaba/FunASR.git && cd FunASR
pip3 install -e ./
```
如果需要使用工业预训练模型安装modelscope可选
```shell
pip3 install -U modelscope
```
## 模型仓库

View File

@ -1 +1 @@
1.0.3
1.0.4

View File

@ -132,33 +132,3 @@ python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --au
For more examples, please refer to [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_advanced_guide_offline.md)
## Industrial Model Egs
If you want to use the pre-trained industrial models in ModelScope for inference or fine-tuning training, you can refer to the following command:
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
inference_pipeline = pipeline(
task=Tasks.auto_speech_recognition,
model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)
rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
```
More examples could be found in [docs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)
## Academic model egs
If you want to train from scratch, usually for academic models, you can start training and inference with the following command:
```shell
cd egs/aishell/paraformer
. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
```
More examples could be found in [docs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)

View File

@ -130,35 +130,3 @@ python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --au
### 工业模型egs
如果您希望使用ModelScope中预训练好的工业模型进行推理或者微调训练您可以参考下面指令
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
inference_pipeline = pipeline(
task=Tasks.auto_speech_recognition,
model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)
rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
```
更多例子可以参考([点击此处](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)
### 学术模型egs
如果您希望从头开始训练,通常为学术模型,您可以通过下面的指令启动训练与推理:
```shell
cd egs/aishell/paraformer
. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
```
更多例子可以参考([点击此处](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html)