diff --git a/docs/installation/docker.md b/docs/installation/docker.md
index 24b1532c7..8de339066 100644
--- a/docs/installation/docker.md
+++ b/docs/installation/docker.md
@@ -1,3 +1,5 @@
+([简体中文](./docker_zh.md)|English)
+
 # Docker
 
 ## Install Docker
diff --git a/docs/installation/docker_zh.md b/docs/installation/docker_zh.md
new file mode 100644
index 000000000..74f303e9c
--- /dev/null
+++ b/docs/installation/docker_zh.md
@@ -0,0 +1,72 @@
+(简体中文|[English](./docker.md))
+
+# Docker
+
+## 安装Docker
+
+### Ubuntu
+```shell
+curl -fsSL https://test.docker.com -o test-docker.sh
+sudo sh test-docker.sh
+```
+### Debian
+```shell
+ curl -fsSL https://get.docker.com -o get-docker.sh
+ sudo sh get-docker.sh
+```
+
+### CentOS
+```shell
+curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
+```
+
+### MacOS
+```shell
+brew install --cask --appdir=/Applications docker
+```
+
+### Windows
+请参考[文档](https://docs.docker.com/desktop/install/windows-install/)
+
+## 启动Docker
+```shell
+sudo systemctl start docker
+```
+## 下载Docker镜像
+
+### 镜像仓库
+
+#### CPU
+`registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0`
+
+#### GPU
+
+`registry.cn-beijing.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.7.0`
+
+### 拉取镜像
+```shell
+sudo docker pull :
+```
+
+### 查看镜像
+```shell
+sudo docker images
+```
+
+## 运行Docker
+```shell
+# cpu
+sudo docker run -itd --name funasr -v : /bin/bash
+# gpu
+sudo docker run -itd --gpus all --name funasr -v : /bin/bash
+
+sudo docker exec -it funasr /bin/bash
+```
+
+## 停止Docker
+```shell
+exit
+sudo docker ps
+sudo docker stop funasr
+```
+
diff --git a/docs/installation/installation.md b/docs/installation/installation.md
index f81ae8315..43856cdfa 100755
--- a/docs/installation/installation.md
+++ b/docs/installation/installation.md
@@ -1,3 +1,5 @@
+([简体中文](./installation_zh.md)|English)
+

@@ -13,7 +15,7 @@
 wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
 sh Miniconda3-latest-Linux-x86_64.sh
 source ~/.bashrc
-conda create -n funasr python=3.7
+conda create -n funasr python=3.8
 conda activate funasr
 ```
 #### Mac
@@ -60,7 +62,7 @@ If you want to use the pretrained models in ModelScope, you should install the m
 ```shell
 pip3 install -U modelscope
 # For the users in China, you could install with the command:
-# pip3 install -U modelscope -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html -i https://mirror.sjtu.edu.cn/pypi/web/simple
+# pip3 install -U modelscope -i https://mirror.sjtu.edu.cn/pypi/web/simple
 ```
 
 ### FQA
diff --git a/docs/installation/installation_zh.md b/docs/installation/installation_zh.md
new file mode 100755
index 000000000..f66314811
--- /dev/null
+++ b/docs/installation/installation_zh.md
@@ -0,0 +1,75 @@
+(简体中文|[English](./installation.md))
+
+

+
+
+
+

+
+## 安装
+
+### 安装Conda(可选):
+
+#### Linux
+```sh
+wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
+sh Miniconda3-latest-Linux-x86_64.sh
+source ~/.bashrc
+conda create -n funasr python=3.8
+conda activate funasr
+```
+#### Mac
+```sh
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
+# For M1 chip
+# wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
+sh Miniconda3-latest-MacOSX*
+source ~/.zshrc
+conda create -n funasr python=3.8
+conda activate funasr
+```
+#### Windows
+Ref to [docs](https://docs.conda.io/en/latest/miniconda.html#windows-installers)
+
+### 安装Pytorch(版本 >= 1.11.0):
+
+```sh
+pip3 install torch torchaudio
+```
+如果您的环境中存在CUDA,则应安装与CUDA匹配版本的pytorch,匹配列表可在文档中找到([文档](https://pytorch.org/get-started/previous-versions/))。
+### 安装funasr
+
+#### 从pip安装
+
+```shell
+pip3 install -U funasr
+# 对于中国大陆用户,可以使用以下命令进行安装:
+# pip3 install -U funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
+```
+
+#### 或者从源代码安装
+
+``` sh
+git clone https://github.com/alibaba/FunASR.git && cd FunASR
+pip3 install -e ./
+# 对于中国大陆用户,可以使用以下命令进行安装:
+# pip3 install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
+```
+
+### 安装modelscope(可选)
+
+如果您想要使用ModelScope中的预训练模型,则应安装modelscope:
+
+```shell
+pip3 install -U modelscope
+# 对于中国大陆用户,可以使用以下命令进行安装:
+# pip3 install -U modelscope -i https://mirror.sjtu.edu.cn/pypi/web/simple
+```
+
+### 常见问题解答
+- 在MAC M1芯片上安装时,可能会出现以下错误:
+- - _cffi_backend.cpython-38-darwin.so' (mach-o file, but is an incompatible architecture (have (x86_64), need (arm64e)))
+    ```shell
+    pip uninstall cffi pycparser
+    ARCHFLAGS="-arch arm64" pip install cffi pycparser --compile --no-cache-dir
+    ```
diff --git a/docs/modelscope_pipeline/quick_start.md b/docs/modelscope_pipeline/quick_start.md
index 7e35e9159..2b9219ba2 100644
--- a/docs/modelscope_pipeline/quick_start.md
+++ b/docs/modelscope_pipeline/quick_start.md
@@ -1,3 +1,5 @@
+([简体中文](./quick_start_zh.md)|English)
+
 # Quick Start
 > **Note**:
@@ -221,5 +223,4 @@ tail log.txt
 If you want finetune with multi-GPUs, you could:
 ```shell
 CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
-```
-
+```
\ No newline at end of file
diff --git a/docs/modelscope_pipeline/quick_start_zh.md b/docs/modelscope_pipeline/quick_start_zh.md
new file mode 100644
index 000000000..91ad3c018
--- /dev/null
+++ b/docs/modelscope_pipeline/quick_start_zh.md
@@ -0,0 +1,227 @@
+(简体中文|[English](./quick_start.md))
+
+# 快速使用
+
+> **注意**:
+> modelscope pipeline支持model zoo中的所有模型进行推理和微调。这里我们以典型模型为例来演示用法。
+
+
+## 使用pipeline进行推理
+
+### 语音识别
+#### Paraformer模型
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+    task=Tasks.auto_speech_recognition,
+    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
+)
+
+rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
+print(rec_result)
+# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
+```
+
+### 语音端点检测
+#### FSMN-VAD模型
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+from modelscope.utils.logger import get_logger
+import logging
+logger = get_logger(log_level=logging.CRITICAL)
+logger.setLevel(logging.CRITICAL)
+
+inference_pipeline = pipeline(
+    task=Tasks.voice_activity_detection,
+    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
+    )
+
+segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
+print(segments_result)
+# {'text': [[70, 2340], [2620, 6200], [6480, 23670], [23950, 26250], [26780, 28990], [29950, 31430], [31750, 37600], [38210, 46900], [47310, 49630], [49910, 56460], [56740, 59540], [59820, 70450]]}
+```
+
+### 标点恢复
+#### CT_Transformer模型
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+    task=Tasks.punctuation,
+    model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
+    )
+
+rec_result = inference_pipeline(text_in='我们都是木头人不会讲话不会动')
+print(rec_result)
+# {'text': '我们都是木头人,不会讲话,不会动。'}
+```
+
+### 时间戳预测
+#### TP-Aligner模型
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+    task=Tasks.speech_timestamp,
+    model='damo/speech_timestamp_prediction-v1-16k-offline',)
+
+rec_result = inference_pipeline(
+    audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
+    text_in='一 个 东 太 平 洋 国 家 为 什 么 跑 到 西 太 平 洋 来 了 呢',)
+print(rec_result)
+# {'text': ' 0.000 0.380;一 0.380 0.560;个 0.560 0.800;东 0.800 0.980;太 0.980 1.140;平 1.140 1.260;洋 1.260 1.440;国 1.440 1.680;家 1.680 1.920; 1.920 2.040;为 2.040 2.200;什 2.200 2.320;么 2.320 2.500;跑 2.500 2.680;到 2.680 2.860;西 2.860 3.040;太 3.040 3.200;平 3.200 3.380;洋 3.380 3.500;来 3.500 3.640;了 3.640 3.800;呢 3.800 4.150; 4.150 4.440;', 'timestamp': [[380, 560], [560, 800], [800, 980], [980, 1140], [1140, 1260], [1260, 1440], [1440, 1680], [1680, 1920], [2040, 2200], [2200, 2320], [2320, 2500], [2500, 2680], [2680, 2860], [2860, 3040], [3040, 3200], [3200, 3380], [3380, 3500], [3500, 3640], [3640, 3800], [3800, 4150]]}
+```
+
+### 说话人确认
+#### X-vector模型
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+import numpy as np
+
+inference_sv_pipline = pipeline(
+    task=Tasks.speaker_verification,
+    model='damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch'
+)
+
+# embedding extract
+spk_embedding = inference_sv_pipline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav')["spk_embedding"]
+
+# speaker verification
+rec_result = inference_sv_pipline(audio_in=('https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav','https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_same.wav'))
+print(rec_result["scores"][0])
+# 0.8540499500025098
+```
+
+### 说话人日志
+#### SOND模型
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_diar_pipline = pipeline(
+    mode="sond_demo",
+    num_workers=0,
+    task=Tasks.speaker_diarization,
+    diar_model_config="sond.yaml",
+    model='damo/speech_diarization_sond-en-us-callhome-8k-n16k4-pytorch',
+    model_revision="v1.0.3",
+    sv_model="damo/speech_xvector_sv-en-us-callhome-8k-spk6135-pytorch",
+    sv_model_revision="v1.0.0",
+)
+
+audio_list=[
+    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/record.wav",
+    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_A.wav",
+    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_B.wav",
+    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_B1.wav"
+]
+
+results = inference_diar_pipline(audio_in=audio_list)
+print(results)
+# {'text': 'spk1 [(0.8, 1.84), (2.8, 6.16), (7.04, 10.64), (12.08, 12.8), (14.24, 15.6)]\nspk2 [(0.0, 1.12), (1.68, 3.2), (4.48, 7.12), (8.48, 9.04), (10.56, 14.48), (15.44, 16.0)]'}
+```
+
+### 常见问题
+#### 使用pipeline进行推理,如何在CPU与GPU进行切换
+
+By default, the pipeline decodes on GPU (`ngpu=1`) when a GPU is available. To decode on CPU instead, set `ngpu=0`:
+```python
+inference_pipeline = pipeline(
+    task=Tasks.auto_speech_recognition,
+    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
+    ngpu=0,
+)
+```
+
+#### 如何从本地模型进行推理(不联网使用)
+使用modelscope-sdk将模型下载到本地
+
+```python
+from modelscope.hub.snapshot_download import snapshot_download
+
+local_dir_root = "./models_from_modelscope"
+model_dir = snapshot_download('damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch', cache_dir=local_dir_root)
+```
+
+或者使用git将模型下载到本地
+```shell
+git lfs install
+# git clone https://www.modelscope.cn//.git
+git clone https://www.modelscope.cn/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch.git
+```
+
+从下载的本地模型进行推理(可以不联网使用)
+```python
+local_dir_root = "./models_from_modelscope/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
+inference_pipeline = pipeline(
+    task=Tasks.auto_speech_recognition,
+    model=local_dir_root,
+)
+```
+
+## 使用pipeline进行微调
+### 语音识别
+#### Paraformer模型
+
+finetune.py
+```python
+import os
+from modelscope.metainfo import Trainers
+from modelscope.trainers import build_trainer
+from modelscope.msdatasets.audio.asr_dataset import ASRDataset
+
+def modelscope_finetune(params):
+    if not os.path.exists(params.output_dir):
+        os.makedirs(params.output_dir, exist_ok=True)
+    # dataset split ["train", "validation"]
+    ds_dict = ASRDataset.load(params.data_path, namespace='speech_asr')
+    kwargs = dict(
+        model=params.model,
+        data_dir=ds_dict,
+        dataset_type=params.dataset_type,
+        work_dir=params.output_dir,
+        batch_bins=params.batch_bins,
+        max_epoch=params.max_epoch,
+        lr=params.lr)
+    trainer = build_trainer(Trainers.speech_asr_trainer, default_args=kwargs)
+    trainer.train()
+
+
+if __name__ == '__main__':
+    from funasr.utils.modelscope_param import modelscope_args
+    params = modelscope_args(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
+    params.output_dir = "./checkpoint"                      # 模型保存路径
+    params.data_path = "speech_asr_aishell1_trainsets"      # 数据路径,可以为modelscope中已上传数据,也可以是本地数据
+    params.dataset_type = "small"                           # 小数据量设置small,若数据量大于1000小时,请使用large
+    params.batch_bins = 2000                                # batch size,如果dataset_type="small",batch_bins单位为fbank特征帧数,如果dataset_type="large",batch_bins单位为毫秒,
+    params.max_epoch = 50                                   # 最大训练轮数
+    params.lr = 0.00005                                     # 设置学习率
+
+    modelscope_finetune(params)
+```
+
+```shell
+python finetune.py &> log.txt &
+```
+tail log.txt
+```
+[bach-gpu011024008134] 2023-04-23 18:59:13,976 (e2e_asr_paraformer:467) INFO: enable sampler in paraformer, sampling_ratio: 0.75
+[bach-gpu011024008134] 2023-04-23 18:59:48,924 (trainer:777) INFO: 2epoch:train:1-50batch:50num_updates: iter_time=0.008, forward_time=0.302, loss_att=0.186, acc=0.942, loss_pre=0.005, loss=0.192, backward_time=0.231, optim_step_time=0.117, optim0_lr0=7.484e-06, train_time=0.753
+[bach-gpu011024008134] 2023-04-23 19:00:23,869 (trainer:777) INFO: 2epoch:train:51-100batch:100num_updates: iter_time=1.152e-04, forward_time=0.275, loss_att=0.184, acc=0.945, loss_pre=0.005, loss=0.189, backward_time=0.234, optim_step_time=0.117, optim0_lr0=7.567e-06, train_time=0.699
+[bach-gpu011024008134] 2023-04-23 19:00:58,463 (trainer:777) INFO: 2epoch:train:101-150batch:150num_updates: iter_time=1.123e-04, forward_time=0.271, loss_att=0.204, acc=0.942, loss_pre=0.005, loss=0.210, backward_time=0.231, optim_step_time=0.116, optim0_lr0=7.651e-06, train_time=0.692
+```
+
+### 常见问题
+#### 多GPU训练
+
+可以使用下面的指令进行多GPU训练
+```shell
+CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
+```
+
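
The sample `tail log.txt` output in `quick_start_zh.md` reports training metrics as `key=value` pairs. As a minimal sketch for tracking a fine-tuning run, the snippet below pulls those pairs out of a single log line. It assumes the log layout matches the sample in this patch; `parse_train_log_line` is a hypothetical helper name and is not part of FunASR or ModelScope.

```python
import re

# Matches "name=value" pairs such as "loss=0.192" or "iter_time=1.152e-04".
_METRIC_RE = re.compile(r"(\w+)=([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_train_log_line(line):
    """Return the numeric key=value metrics found in one log line as a dict.

    Hypothetical helper: the layout is inferred from the sample log only.
    """
    return {name: float(value) for name, value in _METRIC_RE.findall(line)}


# One line copied from the sample log output in quick_start_zh.md.
sample = (
    "[bach-gpu011024008134] 2023-04-23 18:59:48,924 (trainer:777) INFO: "
    "2epoch:train:1-50batch:50num_updates: iter_time=0.008, forward_time=0.302, "
    "loss_att=0.186, acc=0.942, loss_pre=0.005, loss=0.192, backward_time=0.231, "
    "optim_step_time=0.117, optim0_lr0=7.484e-06, train_time=0.753"
)
metrics = parse_train_log_line(sample)
print(metrics["loss"], metrics["acc"])
```

Lines without `key=value` pairs, such as the `sampling_ratio: 0.75` line, yield an empty dict, so the helper can be applied to every line of `log.txt` and the empty results skipped.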