mirror of https://github.com/modelscope/FunASR (synced 2025-09-15 14:48:36 +08:00)

docs zh

This commit is contained in: parent 89ab5d5a3b, commit e22f256ee6
@ -71,12 +71,10 @@ Overview
:maxdepth: 1
:caption: Runtime and Service

./funasr/export/README.md
./funasr/runtime/python/onnxruntime/README.md
./funasr/runtime/docs/SDK_tutorial.md
./funasr/runtime/python/websocket/README.md
./funasr/runtime/websocket/readme.md
./funasr/runtime/html5/readme.md
./funasr/runtime/python/libtorch/README.md
@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md
@ -1,58 +0,0 @@
# ModelScope Model

## How to finetune and infer using a pretrained Paraformer-large Model

### Finetune

- Modify the finetune training parameters in `finetune.py`:
    - <strong>output_dir:</strong> # result dir
    - <strong>data_dir:</strong> # the dataset dir needs to include the files `train/wav.scp`, `train/text`, `validation/wav.scp`, and `validation/text`
    - <strong>dataset_type:</strong> # for datasets larger than 1000 hours, set to `large`; otherwise set to `small`
    - <strong>batch_bins:</strong> # batch size. If `dataset_type` is `small`, `batch_bins` counts feature frames; if `dataset_type` is `large`, it is the duration in ms
    - <strong>max_epoch:</strong> # number of training epochs
    - <strong>lr:</strong> # learning rate

- Then you can run the pipeline to finetune with:
```shell
python finetune.py
```

### Inference

You can also use the finetuned model for inference directly.

- Set parameters in `infer.sh`:
    - <strong>model:</strong> # model name on ModelScope
    - <strong>data_dir:</strong> # the dataset dir needs to include `test/wav.scp`. If `test/text` also exists, CER will be computed
    - <strong>output_dir:</strong> # result dir
    - <strong>batch_size:</strong> # batch size of inference
    - <strong>gpu_inference:</strong> # whether to perform GPU decoding; set false for CPU decoding
    - <strong>gpuid_list:</strong> # the GPUs to use, e.g., gpuid_list="0,1"
    - <strong>njob:</strong> # the number of jobs for CPU decoding; if `gpu_inference`=false, CPU decoding is used and `njob` must be set

- Then you can run the pipeline to infer with:
```shell
sh infer.sh
```

- Results

The decoding results can be found in `$output_dir/1best_recog/text.cer`, which includes the recognition result of each sample and the CER metric of the whole test set.

### Inference using a local finetuned model

- Modify the inference parameters in `infer_after_finetune.py`:
    - <strong>modelscope_model_name:</strong> # model name on ModelScope
    - <strong>output_dir:</strong> # result dir
    - <strong>data_dir:</strong> # the dataset dir needs to include `test/wav.scp`. If `test/text` also exists, CER will be computed
    - <strong>decoding_model_name:</strong> # the checkpoint name for decoding, e.g., `valid.cer_ctc.ave.pb`
    - <strong>batch_size:</strong> # batch size of inference

- Then you can run the pipeline to infer with:
```shell
python infer_after_finetune.py
```

- Results

The decoding results can be found in `$output_dir/decoding_results/text.cer`, which includes the recognition result of each sample and the CER metric of the whole test set.
@ -0,0 +1 @@
../TEMPLATE/README.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

egs_modelscope/asr_vad_punc/TEMPLATE (symbolic link)
@ -0,0 +1 @@
../asr/TEMPLATE
@ -1,246 +0,0 @@
# Speech Recognition

> **Note**:
> The modelscope pipeline supports all the models in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope) for inference and finetuning. Here we take typical models as examples to demonstrate the usage.

## Inference

### Quick start
#### [Paraformer Model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
```
#### [Paraformer-online Model](https://www.modelscope.cn/models/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online/summary)
```python
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online',
)
import soundfile
speech, sample_rate = soundfile.read("example/asr_example.wav")

param_dict = {"cache": dict(), "is_final": False}
chunk_stride = 7680  # 480ms
# first chunk, 480ms
speech_chunk = speech[0:chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
# next chunk, 480ms
speech_chunk = speech[chunk_stride:chunk_stride + chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
```
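The snippet above decodes only the first two chunks. As a minimal sketch (extrapolated from the `is_final` flag already present in `param_dict`, not code from the original README), the stream would be closed like this:

```python
# last chunk: mark the stream as final so any cached frames are flushed
speech_chunk = speech[2 * chunk_stride:]
param_dict["is_final"] = True
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
```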
For the full demo code, please refer to [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/241).

#### [UniASR Model](https://www.modelscope.cn/models/damo/speech_UniASR_asr_2pass-zh-cn-8k-common-vocab3445-pytorch-online/summary)
There are three decoding modes for the UniASR model (`fast`, `normal`, `offline`); for more model details, please refer to the [docs](https://www.modelscope.cn/models/damo/speech_UniASR_asr_2pass-zh-cn-8k-common-vocab3445-pytorch-online/summary).
```python
decoding_model = "fast"  # "fast", "normal", "offline"
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_UniASR_asr_2pass-minnan-16k-common-vocab3825',
    param_dict={"decoding_model": decoding_model})

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
```
The `fast` and `normal` decoding modes are fake streaming, which can be used to evaluate recognition accuracy.
For the full demo code, please refer to [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/151).
#### [RNN-T-online model]()
To be done.
#### [MFCCA Model](https://www.modelscope.cn/models/NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950/summary)
For more model details, please refer to the [docs](https://www.modelscope.cn/models/NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950/summary).
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950',
    model_revision='v3.0.0'
)

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
```

#### API-reference
##### Define pipeline
- `task`: `Tasks.auto_speech_recognition`
- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `ngpu`: `1` (default), decode on GPU; if ngpu=0, decode on CPU
- `ncpu`: `1` (default), the number of threads used for intra-op parallelism on CPU
- `output_dir`: `None` (default), the output path of the results, if set
- `batch_size`: `1` (default), batch size when decoding
##### Infer pipeline
- `audio_in`: the input to decode, which can be:
  - a wav file path, e.g.: asr_example.wav
  - a pcm file path, e.g.: asr_example.pcm
  - an audio bytes stream, e.g.: bytes data from a microphone
  - audio samples, e.g.: `audio, rate = soundfile.read("asr_example_zh.wav")`; the dtype is numpy.ndarray or torch.Tensor
  - wav.scp, a kaldi-style wav list (`wav_id \t wav_path`), e.g.:
    ```text
    asr_example1 ./audios/asr_example1.wav
    asr_example2 ./audios/asr_example2.wav
    ```
    For `wav.scp` input, `output_dir` must be set to save the output results (see the sketch after this list)
- `audio_fs`: audio sampling rate; only set when audio_in is pcm audio
- `output_dir`: `None` (default), the output path of the results, if set
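As a minimal sketch of the `wav.scp` case (the list path and output directory are illustrative assumptions, not from the original README):

```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# output_dir is required for wav.scp input; results are written under it
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    output_dir='./results',
)
inference_pipeline(audio_in='./data/test/wav.scp')
```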
### Inference with multi-threaded CPUs or multiple GPUs
FunASR also offers the recipe [egs_modelscope/asr/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/infer.sh) to decode with multi-threaded CPUs or multiple GPUs.

- Setting parameters in `infer.sh`:
    - `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
    - `data_dir`: the dataset dir needs to include `wav.scp`. If `${data_dir}/text` also exists, CER will be computed
    - `output_dir`: output dir of the recognition results
    - `batch_size`: `64` (default), batch size of inference on GPU
    - `gpu_inference`: `true` (default), whether to perform GPU decoding; set false for CPU inference
    - `gpuid_list`: `0,1` (default), which GPU IDs are used for inference
    - `njob`: only used for CPU inference (`gpu_inference`=`false`); `64` (default), the number of jobs for CPU decoding
    - `checkpoint_dir`: only used for inference with finetuned models; the path of the finetuned models
    - `checkpoint_name`: only used for inference with finetuned models; `valid.cer_ctc.ave.pb` (default), which checkpoint to use
    - `decoding_mode`: `normal` (default), decoding mode for the UniASR model (fast, normal, offline)
    - `hotword_txt`: `None` (default), hotword file for the contextual paraformer model (the hotword file name ends with .txt)

- Decode with multiple GPUs:
```shell
bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --batch_size 64 \
    --gpu_inference true \
    --gpuid_list "0,1"
```
- Decode with multi-threaded CPUs:
```shell
bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --gpu_inference false \
    --njob 64
```

- Results

The decoding results can be found in `$output_dir/1best_recog/text.cer`, which includes the recognition result of each sample and the CER metric of the whole test set.

If you decode the SpeechIO test sets, you can use text normalization with `stage`=3; `DETAILS.txt` and `RESULTS.txt` record the results and CER after text normalization.

## Finetune with pipeline

### Quick start
[finetune.py](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/finetune.py)
```python
import os
from modelscope.metainfo import Trainers
from modelscope.trainers import build_trainer
from modelscope.msdatasets.audio.asr_dataset import ASRDataset


def modelscope_finetune(params):
    if not os.path.exists(params.output_dir):
        os.makedirs(params.output_dir, exist_ok=True)
    # dataset split ["train", "validation"]
    ds_dict = ASRDataset.load(params.data_path, namespace='speech_asr')
    kwargs = dict(
        model=params.model,
        data_dir=ds_dict,
        dataset_type=params.dataset_type,
        work_dir=params.output_dir,
        batch_bins=params.batch_bins,
        max_epoch=params.max_epoch,
        lr=params.lr)
    trainer = build_trainer(Trainers.speech_asr_trainer, default_args=kwargs)
    trainer.train()


if __name__ == '__main__':
    from funasr.utils.modelscope_param import modelscope_args
    params = modelscope_args(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
    params.output_dir = "./checkpoint"                  # path to save the model
    params.data_path = "speech_asr_aishell1_trainsets"  # data path; either a dataset uploaded to ModelScope or local data
    params.dataset_type = "small"                       # use "small"; if the dataset is larger than 1000 hours, use "large"
    params.batch_bins = 2000                            # batch size; if dataset_type="small", batch_bins is in fbank feature frames; if dataset_type="large", in milliseconds
    params.max_epoch = 50                               # maximum number of training epochs
    params.lr = 0.00005                                 # learning rate

    modelscope_finetune(params)
```

Run the script in the background and write the logs to a file:
```shell
python finetune.py &> log.txt &
```
### Finetune with your data

- Modify the finetune training parameters in [finetune.py](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/finetune.py):
    - `output_dir`: result dir
    - `data_dir`: the dataset dir needs to include the files `train/wav.scp`, `train/text`, `validation/wav.scp`, and `validation/text`
    - `dataset_type`: for datasets larger than 1000 hours, set to `large`; otherwise set to `small`
    - `batch_bins`: batch size. If `dataset_type` is `small`, `batch_bins` counts feature frames; if `dataset_type` is `large`, it is the duration in ms
    - `max_epoch`: number of training epochs
    - `lr`: learning rate

- Training data formats:
```sh
cat ./example_data/text
BAC009S0002W0122 而 对 楼 市 成 交 抑 制 作 用 最 大 的 限 购
BAC009S0002W0123 也 成 为 地 方 政 府 的 眼 中 钉
english_example_1 hello world
english_example_2 go swim 去 游 泳

cat ./example_data/wav.scp
BAC009S0002W0122 /mnt/data/wav/train/S0002/BAC009S0002W0122.wav
BAC009S0002W0123 /mnt/data/wav/train/S0002/BAC009S0002W0123.wav
english_example_1 /mnt/data/wav/train/S0002/english_example_1.wav
english_example_2 /mnt/data/wav/train/S0002/english_example_2.wav
```

- Then you can run the pipeline to finetune with:
```shell
python finetune.py
```
If you want to finetune with multiple GPUs, you can run:
```shell
CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
```
## Inference with your finetuned model

- Setting parameters in [egs_modelscope/asr/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/infer.sh) is the same as in the [docs](https://github.com/alibaba-damo-academy/FunASR/tree/main/egs_modelscope/asr/TEMPLATE#inference-with-multi-thread-cpus-or-multi-gpus); `model` is the name of the ModelScope model that you finetuned.

- Decode with multiple GPUs:
```shell
bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --batch_size 64 \
    --gpu_inference true \
    --gpuid_list "0,1" \
    --checkpoint_dir "./checkpoint" \
    --checkpoint_name "valid.cer_ctc.ave.pb"
```
- Decode with multi-threaded CPUs:
```shell
bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --gpu_inference false \
    --njob 64 \
    --checkpoint_dir "./checkpoint" \
    --checkpoint_name "valid.cer_ctc.ave.pb"
```
@ -0,0 +1 @@
../TEMPLATE/README.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md
@ -1,3 +1,5 @@
([简体中文](./README_zh.md)|English)

# Punctuation Restoration

> **Note**:

egs_modelscope/punctuation/TEMPLATE/README_zh.md (new file, 112 lines)
@ -0,0 +1,112 @@
(简体中文|[English](./README.md))
# Punctuation Restoration

> **Note**:
> The pipeline supports inference and finetuning with all the models in the [ModelScope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope). Here we take the CT-Transformer model as an example to demonstrate the usage.

## Inference

### Quick start
#### [CT-Transformer Model](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary)
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.punctuation,
    model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
    model_revision=None)

rec_result = inference_pipeline(text_in='example/punc_example.txt')
print(rec_result)
```
- Text string input, e.g., text read directly from a file:
```python
rec_result = inference_pipeline(text_in='我们都是木头人不会讲话不会动')
```
- Text file URL, e.g.: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/punc_example.txt
```python
rec_result = inference_pipeline(text_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/punc_example.txt')
```

#### [CT-Transformer Realtime Model](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727/summary)
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.punctuation,
    model='damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727',
    model_revision=None,
)

inputs = "跨境河流是养育沿岸|人民的生命之源长期以来为帮助下游地区防灾减灾中方技术人员|在上游地区极为恶劣的自然条件下克服巨大困难甚至冒着生命危险|向印方提供汛期水文资料处理紧急事件中方重视印方在跨境河流问题上的关切|愿意进一步完善双方联合工作机制|凡是|中方能做的我们|都会去做而且会做得更好我请印度朋友们放心中国在上游的|任何开发利用都会经过科学|规划和论证兼顾上下游的利益"
vads = inputs.split("|")
rec_result_all = "outputs:"
param_dict = {"cache": []}
for vad in vads:
    rec_result = inference_pipeline(text_in=vad, param_dict=param_dict)
    rec_result_all += rec_result['text']

print(rec_result_all)
```
For the full demo code, please refer to [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/238)

### API reference
#### Define pipeline
- `task`: `Tasks.punctuation`
- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `ngpu`: `1` (default), infer on GPU; if ngpu=0, infer on CPU
- `ncpu`: `1` (default), the number of threads used for intra-op parallelism on CPU
- `output_dir`: `None` (default), the output path of the results, if set
- `model_revision`: `None` (default), the model version on ModelScope

#### Infer pipeline
- `text_in`: the input to infer, which can be:
  - a text string, e.g.: "我们都是木头人不会讲话不会动"
  - a text file, e.g.: example/punc_example.txt.
    For text file input, `output_dir` must be set to save the output results (see the sketch after this list).
- `param_dict`: the cache required in realtime mode.
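As a minimal sketch of the text-file case (the output directory is an illustrative assumption, not from the original README):

```python
# output_dir is required for text file input; results are written under it
inference_pipeline = pipeline(
    task=Tasks.punctuation,
    model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
    output_dir='./punc_results')
rec_result = inference_pipeline(text_in='example/punc_example.txt')
```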
### Inference with multi-threaded CPUs or multiple GPUs
FunASR also provides the script [egs_modelscope/punctuation/TEMPLATE/infer.sh](infer.sh) to decode with multi-threaded CPUs or multiple GPUs.

#### `infer.sh` settings
- `model`: model name in the [ModelScope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `data_dir`: the dataset dir needs to include the `punc.txt` file
- `output_dir`: the output dir of the recognition results
- `batch_size`: `1` (default), batch size of inference on GPU
- `gpu_inference`: `true` (default), whether to perform GPU decoding; set to `false` for CPU inference
- `gpuid_list`: `0,1` (default), the GPU IDs used for inference
- `njob`: only used for CPU inference (`gpu_inference=false`); `64` (default), the number of jobs for CPU decoding

#### Decode with multiple GPUs:
```shell
bash infer.sh \
    --model "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --batch_size 1 \
    --gpu_inference true \
    --gpuid_list "0,1"
```
#### Decode with multi-threaded CPUs:
```shell
bash infer.sh \
    --model "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --gpu_inference false \
    --njob 1
```

## Finetune with pipeline

### Quick start

### Finetune with your data

## Inference with your finetuned model
@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md

@ -0,0 +1 @@
../TEMPLATE/README_zh.md
@ -1,3 +1,5 @@
([简体中文](./README_zh.md)|English)

# Timestamp Prediction (FA)

## Inference

egs_modelscope/tp/TEMPLATE/README_zh.md (new file, 102 lines)
@ -0,0 +1,102 @@
(简体中文|[English](./README.md))

# Timestamp Prediction

## Inference

### Quick start
#### [TP-Aligner Model](https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary)
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.speech_timestamp,
    model='damo/speech_timestamp_prediction-v1-16k-offline',
    model_revision='v1.1.0')

rec_result = inference_pipeline(
    audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
    text_in='一 个 东 太 平 洋 国 家 为 什 么 跑 到 西 太 平 洋 来 了 呢',)
print(rec_result)
```

The timestamp pipeline can also be used after an ASR pipeline to compose a complete ASR function; refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/246).
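As a rough sketch of that composition (assumptions: `asr_pipeline` is a hypothetical Paraformer pipeline as defined in the ASR README, its result carries a `text` field, and that text is space-separated as `text_in` requires; see the linked demo for the authoritative version):

```python
wav = 'https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav'
asr_result = asr_pipeline(audio_in=wav)        # recognize the audio first (hypothetical ASR pipeline)
tp_result = inference_pipeline(audio_in=wav,   # then align the transcript to the audio
                               text_in=asr_result['text'])
print(tp_result)
```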
### API reference
#### Define pipeline
- `task`: `Tasks.speech_timestamp`
- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `ngpu`: `1` (default), infer on GPU; if ngpu=0, infer on CPU
- `ncpu`: `1` (default), the number of threads used for intra-op parallelism on CPU
- `output_dir`: `None` (default), the output path of the results, if set
- `batch_size`: `1` (default), batch size when decoding

#### Infer pipeline
- `audio_in`: the input speech to predict on, which can be:
  - a wav file path, e.g.: asr_example.wav (a local or URL wav file)
  - wav.scp, a kaldi-style wav list (`wav_id wav_path`), e.g.:
    ```text
    asr_example1 ./audios/asr_example1.wav
    asr_example2 ./audios/asr_example2.wav
    ```
    For `wav.scp` input, `output_dir` must be set to save the output results.
- `text_in`: the input text to predict on, space-separated (see the spacing sketch after this list), which can be:
  - a text string, e.g.: `今 天 天 气 怎 么 样`
  - text.scp, a kaldi-style text file (`wav_id transcription`), e.g.:
    ```text
    asr_example1 今 天 天 气 怎 么 样
    asr_example2 欢 迎 体 验 达 摩 院 语 音 识 别 模 型
    ```
- `audio_fs`: audio sampling rate; only set when the input is PCM audio
- `output_dir`: `None` (default); if set, the output path of the results, containing:
  - output_dir/timestamp_prediction/tp_sync, timestamps in seconds including silence segments, `wav_id# token1 start_time end_time;`, e.g.:
    ```text
    test_wav1# <sil> 0.000 0.500;温 0.500 0.680;州 0.680 0.840;化 0.840 1.040;工 1.040 1.280;仓 1.280 1.520;<sil> 1.520 1.680;库 1.680 1.920;<sil> 1.920 2.160;起 2.160 2.380;火 2.380 2.580;殃 2.580 2.760;及 2.760 2.920;附 2.920 3.100;近 3.100 3.340;<sil> 3.340 3.400;河 3.400 3.640;<sil> 3.640 3.700;流 3.700 3.940;<sil> 3.940 4.240;大 4.240 4.400;量 4.400 4.520;死 4.520 4.680;鱼 4.680 4.920;<sil> 4.920 4.940;漂 4.940 5.120;浮 5.120 5.300;河 5.300 5.500;面 5.500 5.900;<sil> 5.900 6.240;
    ```
  - output_dir/timestamp_prediction/tp_time, a timestamp list without silence, in milliseconds, with the same length as the input text, `wav_id# [[start_time, end_time],]`, e.g.:
    ```text
    test_wav1# [[500, 680], [680, 840], [840, 1040], [1040, 1280], [1280, 1520], [1680, 1920], [2160, 2380], [2380, 2580], [2580, 2760], [2760, 2920], [2920, 3100], [3100, 3340], [3400, 3640], [3700, 3940], [4240, 4400], [4400, 4520], [4520, 4680], [4680, 4920], [4940, 5120], [5120, 5300], [5300, 5500], [5500, 5900]]
    ```
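As a minimal sketch of the spacing requirement mentioned above (a simple per-character split; illustrative only, and a real transcript may need language-aware tokenization):

```python
raw = '今天天气怎么样'
text_in = ' '.join(raw)   # -> '今 天 天 气 怎 么 样'
```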
### Inference with multi-threaded CPUs or multiple GPUs
FunASR also provides the script [egs_modelscope/tp/TEMPLATE/infer.sh](infer.sh) to decode with multi-threaded CPUs or multiple GPUs.

#### `infer.sh` settings
- `model`: model name in the [ModelScope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `data_dir`: the dataset dir needs to include the `wav.scp` file. If `${data_dir}/text` also exists, CER will be computed
- `output_dir`: the output dir of the recognition results
- `batch_size`: `1` (default), batch size of inference on GPU
- `gpu_inference`: `true` (default), whether to perform GPU decoding; set to `false` for CPU inference
- `gpuid_list`: `0,1` (default), the GPU IDs used for inference
- `njob`: only used for CPU inference (`gpu_inference=false`); `64` (default), the number of jobs for CPU decoding

#### Decode with multiple GPUs:
```shell
bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --batch_size 1 \
    --gpu_inference true \
    --gpuid_list "0,1"
```
#### Decode with multi-threaded CPUs:
```shell
bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --gpu_inference false \
    --njob 1
```

## Finetune with pipeline

### Quick start

### Finetune with your data

## Inference with your finetuned model
@ -1,3 +1,5 @@
([简体中文](./README_zh.md)|English)

# Voice Activity Detection

> **Note**:

egs_modelscope/vad/TEMPLATE/README_zh.md (new file, 113 lines)
@ -0,0 +1,113 @@
(简体中文|[English](./README.md))

# Voice Activity Detection

> **Note**:
> The pipeline supports inference and finetuning with all the models in the [ModelScope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope). Here we take the FSMN-VAD model as an example to demonstrate the usage.

## Inference

### Quick start
#### [FSMN-VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary)
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
)

segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
print(segments_result)
```
#### [FSMN-VAD Realtime Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary)
```python
inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
)
import soundfile
speech, sample_rate = soundfile.read("example/asr_example.wav")

param_dict = {"in_cache": dict(), "is_final": False}
chunk_stride = 1600  # 100ms
# first chunk, 100ms
speech_chunk = speech[0:chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
# next chunk, 100ms
speech_chunk = speech[chunk_stride:chunk_stride + chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
```
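As a minimal sketch generalizing the two-chunk example above to a whole file (the loop structure is an assumption; it simply repeats the documented call and flips `is_final` on the last chunk):

```python
# stream the full waveform in 100 ms chunks (assumes 16 kHz audio)
for start in range(0, len(speech), chunk_stride):
    speech_chunk = speech[start:start + chunk_stride]
    param_dict["is_final"] = start + chunk_stride >= len(speech)
    rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
    print(rec_result)
```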
For the full demo code, please refer to [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/236)

### API reference
#### Define pipeline
- `task`: `Tasks.voice_activity_detection`
- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `ngpu`: `1` (default), infer on GPU; if ngpu=0, infer on CPU
- `ncpu`: `1` (default), the number of threads used for intra-op parallelism on CPU
- `output_dir`: `None` (default), the output path of the results, if set
- `batch_size`: `1` (default), batch size when decoding
#### Infer pipeline
- `audio_in`: the input to decode, which can be:
  - a wav file path, e.g.: asr_example.wav
  - a pcm file path, e.g.: asr_example.pcm
  - an audio bytes stream, e.g.: bytes data from a microphone
  - audio samples, e.g.: `audio, rate = soundfile.read("asr_example_zh.wav")`; the dtype is numpy.ndarray or torch.Tensor
  - wav.scp, a kaldi-style wav list (`wav_id \t wav_path`), e.g.:
    ```text
    asr_example1 ./audios/asr_example1.wav
    asr_example2 ./audios/asr_example2.wav
    ```
    For `wav.scp` input, `output_dir` must be set to save the output results
- `audio_fs`: audio sampling rate; only set when audio_in is pcm audio
- `output_dir`: `None` (default), the output path of the results, if set

### Inference with multi-threaded CPUs or multiple GPUs
FunASR also provides the script [egs_modelscope/vad/TEMPLATE/infer.sh](infer.sh) to decode with multi-threaded CPUs or multiple GPUs.

#### `infer.sh` settings
- `model`: model name in the [ModelScope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
- `data_dir`: the dataset dir needs to include the `wav.scp` file. If `${data_dir}/text` also exists, CER will be computed
- `output_dir`: the output dir of the recognition results
- `batch_size`: `1` (default), batch size of inference on GPU
- `gpu_inference`: `true` (default), whether to perform GPU decoding; set to `false` for CPU inference
- `gpuid_list`: `0,1` (default), the GPU IDs used for inference
- `njob`: only used for CPU inference (`gpu_inference=false`); `64` (default), the number of jobs for CPU decoding

#### Decode with multiple GPUs:
```shell
bash infer.sh \
    --model "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --batch_size 1 \
    --gpu_inference true \
    --gpuid_list "0,1"
```
#### Decode with multi-threaded CPUs:
```shell
bash infer.sh \
    --model "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch" \
    --data_dir "./data/test" \
    --output_dir "./results" \
    --gpu_inference false \
    --njob 64
```

## Finetune with pipeline

### Quick start

### Finetune with your data

## Inference with your finetuned model
egs_modelscope/vad/speech_fsmn_vad_zh-cn-16k-common/README_zh.md (symbolic link)
@ -0,0 +1 @@
../TEMPLATE/README_zh.md

egs_modelscope/vad/speech_fsmn_vad_zh-cn-8k-common/README_zh.md (symbolic link)
@ -0,0 +1 @@
../TEMPLATE/README_zh.md
@ -1,328 +1,206 @@
# FunASR File Transcription Service Convenient Deployment Tutorial
([简体中文](./SDK_tutorial_zh.md)|English)

FunASR provides offline file transcription services that can be conveniently deployed on local or cloud servers. The core of the service is the open-source FunASR runtime-SDK. It integrates capabilities open-sourced by the speech laboratory of DAMO Academy on the ModelScope community, such as voice endpoint detection (VAD), Paraformer-large speech recognition (ASR), and punctuation restoration (PUNC). With these capabilities, the service can transcribe audio accurately and efficiently under high concurrency.
# FunASR Offline File Transcription Service Convenient Deployment Tutorial

## Installation and Start Service
FunASR provides an offline file transcription service that can be easily deployed on a local or cloud server. Its core is the open-source FunASR runtime-SDK. It integrates capabilities released by the speech laboratory of DAMO Academy in the ModelScope community, such as speech endpoint detection (VAD), Paraformer-large speech recognition (ASR), and punctuation restoration (PUNC). It provides a complete speech recognition pipeline that can turn tens of hours of audio or video into punctuated text, and it supports transcription for hundreds of simultaneous requests.

Environment preparation and configuration ([docs](./aliyun_server_tutorial.md))
## Server Configuration

### Downloading Tools and Deployment
Users can choose appropriate server configurations based on their business needs. The recommended configurations are:
- Configuration 1: (X86, compute-optimized) 4-core vCPU, 8 GB memory; a single machine can support about 32 requests.
- Configuration 2: (X86, compute-optimized) 16-core vCPU, 32 GB memory; a single machine can support about 64 requests.
- Configuration 3: (X86, compute-optimized) 64-core vCPU, 128 GB memory; a single machine can support about 200 requests.

Run the following command to perform a one-click deployment of the FunASR runtime-SDK service, and follow the prompts to complete the deployment and running of the service. Currently, only Linux environments are supported; for other environments, please refer to the advanced SDK development guide ([docs](./SDK_advanced_guide_offline.md)).
Detailed performance [report](./benchmark_onnx_cpp.md)

[//]: # (Due to network restrictions, the download of the funasr-runtime-deploy.sh one-click deployment tool may not proceed smoothly. If the tool has not downloaded and entered the one-click deployment tool after several seconds, please terminate it with Ctrl + C and run the command again.)
Cloud service providers offer a 3-month free trial for new users; application tutorial ([docs](./aliyun_server_tutorial.md)).
## Quick Start

### Server Startup

`Note`: The one-click deployment process includes installing Docker, downloading the Docker image, and starting the service. If you want to start from the FunASR Docker image directly, please refer to the development guide ([docs](./SDK_advanced_guide_offline.md)).

Download the deployment tool `funasr-runtime-deploy-offline-cpu-zh.sh`:

```shell
curl -O https://raw.githubusercontent.com/alibaba-damo-academy/FunASR-APP/main/TransAudio/funasr-runtime-deploy.sh; sudo bash funasr-runtime-deploy.sh install
# For users in China, you can install with the command:
# curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy.sh; sudo bash funasr-runtime-deploy.sh install
curl -O https://raw.githubusercontent.com/alibaba-damo-academy/FunASR/main/funasr/runtime/deploy_tools/funasr-runtime-deploy-offline-cpu-en.sh;
# If there is a network problem, users in mainland China can use the following command:
# curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy-offline-cpu-en.sh;
```

#### Details of Configuration
Execute the deployment tool and press the Enter key at each prompt to complete the installation and deployment of the server. Currently, the convenient deployment tool only supports Linux environments; for other environments, please refer to the development guide ([docs](./SDK_advanced_guide_offline.md)).
```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh install --workspace /root/funasr-runtime-resources
```

##### Choosing the FunASR Docker Image
### Client Testing and Usage

We recommend selecting the "latest" tag to use our latest image, but you can also choose from our historical versions.
After running the installation instructions above, the client testing tool directory `samples` will be downloaded to the default installation directory /root/funasr-runtime-resources ([download link](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz)).
Taking the Python client as an example, it supports multiple audio format inputs (such as .wav, .pcm, .mp3), video inputs (.mp4, etc.), and multi-file wav.scp list inputs. For other client versions, please refer to the [documentation](#Detailed-Description-of-Client-Usage).

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```

## Detailed Description of Client Usage

After completing the FunASR runtime-SDK service deployment on the server, you can test and use the offline file transcription service through the following steps. Currently, clients in the following programming languages are supported:

- [Python](#python-client)
- [CPP](#cpp-client)
- [html](#html-client)
- [java](#java-client)

For more client version support, please refer to the [development guide](./SDK_advanced_guide_offline_zh.md).

### python-client
If you want to run the client directly for testing, you can refer to the following simple instructions, using the Python version as an example:

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```

Command parameter instructions:
```text
--host is the IP address of the machine where the FunASR runtime-SDK service is deployed; it defaults to the local IP address (127.0.0.1). If the client and the service are not on the same server, change it to the IP address of the deployment machine.
--port 10095 deployment port number
--mode offline indicates offline file transcription
--audio_in is the audio to be transcribed, supporting file paths and wav.scp file lists
--thread_num sets the number of concurrent sending threads, default 1
--ssl sets whether to enable SSL certificate verification, default 1 (enabled); set 0 to disable
```
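For example, combining the `--audio_in` and `--thread_num` flags documented above gives a hedged concurrency smoke test (the wav.scp path is an illustrative assumption):

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline \
    --audio_in "./wav.scp" --thread_num 8
```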
### cpp-client

After entering the samples/cpp directory, you can test with the CPP client as follows:
```shell
./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
```

Command parameter description:
```text
--server-ip specifies the IP address of the machine where the FunASR runtime-SDK service is deployed; the default is the local IP address (127.0.0.1). If the client and the service are not on the same server, change it to the IP address of the deployment machine.
--port specifies the deployment port number, 10095.
--wav-path specifies the audio file to be transcribed, as a file path.
--thread_num sets the number of concurrent sending threads, with a default value of 1.
--ssl sets whether to enable SSL certificate verification, with a default value of 1 (enabled); set 0 to disable.
```

### html-client

To experience it directly, open `html/static/index.html` in your browser. You will see the following page, which supports microphone input and file upload.
<img src="images/html.png" width="900"/>

### java-client

```shell
FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
```
For more details, please refer to the [docs](../java/readme.md)
## Server Usage Details

### Start the deployed FunASR service

If you have restarted the computer or shut down Docker after the one-click deployment, you can start the FunASR service directly with the following command. The startup configuration is the same as in the last one-click deployment.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh start
```

### Stop the FunASR service

```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh stop
```

### Release the FunASR service

Release the deployed FunASR service.
```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh remove
```

### Restart the FunASR service

Restart the FunASR service with the same configuration as in the last one-click deployment.
```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh restart
```

### Replace the model and restart the FunASR service

Replace the currently used model and restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope, or a model finetuned from one.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--asr_model | --vad_model | --punc_model] <model_id or local model path>

e.g.
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update --asr_model damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
```

### Update parameters and restart the FunASR service

Update the configured parameters and restart the FunASR service for them to take effect. The parameters that can be updated include the host and Docker port numbers, as well as the numbers of inference and IO threads.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--host_port | --docker_port] <port number>
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--decode_thread_num | --io_thread_num] <the number of threads>
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--workspace] <workspace in local>
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--ssl] <0: close SSL; 1: open SSL, default: 1>

e.g.
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update --decode_thread_num 32
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update --workspace /root/funasr-runtime-resources
```
## Detailed Configuration of the Server Startup Process

### Select the FunASR Docker image
We recommend using our latest released image, but you can also choose historical versions.

```text
[1/9]
[1/5]
Getting the list of docker images, please wait a few seconds.
[DONE]

Please choose the Docker image.
1) registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest
2) registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0
Enter your choice: 1
You have chosen the Docker image: registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest
1) registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0
Enter your choice, default(1):
You have chosen the Docker image: registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0
```

##### Choosing ASR/VAD/PUNC Models

You can choose a model from ModelScope by ID, fill in the name of a ModelScope model as <model_name>, or select <model_path> and fill in a local model path on the host machine. A model chosen by ID is automatically downloaded when Docker runs.
### Set the port provided by the host for FunASR
Set the host port provided to Docker, which is 10095 by default. Please make sure that this port is available.

```text
[2/9]
Please input [Y/n] to confirm whether to automatically download model_id in ModelScope or use a local model.
[y] With the model in ModelScope, the model will be automatically downloaded to Docker(/workspace/models).
If you select both the local model and the model in ModelScope, select [y].
[n] Use the models on the localhost, the directory where the model is located will be mapped to Docker.
Setting confirmation[Y/n]:
You have chosen to use the model in ModelScope, please set the model ID in the next steps, and the model will be automatically downloaded in (/workspace/models) during the run.

Please enter the local path to download models, the corresponding path in Docker is /workspace/models.
Setting the local path to download models, default(/root/models):
The local path(/root/models) set will store models during the run.

[2.1/9]
Please select ASR model_id in ModelScope from the list below.
1) damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
2) model_name
3) model_path
Enter your choice: 1
The model ID is damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
The model dir in Docker is /workspace/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx

[2.2/9]
Please select VAD model_id in ModelScope from the list below.
1) damo/speech_fsmn_vad_zh-cn-16k-common-onnx
2) model_name
3) model_path
Enter your choice: 1
The model ID is damo/speech_fsmn_vad_zh-cn-16k-common-onnx
The model dir in Docker is /workspace/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx

[2.3/9]
Please select PUNC model_id in ModelScope from the list below.
1) damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
2) model_name
3) model_path
Enter your choice: 1
The model ID is damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
The model dir in Docker is /workspace/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
```
##### Enter the executable path of the FunASR service on the host machine

Enter the host path of the FunASR service executable. It will be automatically mounted and run in Docker at runtime. If left blank, the default path /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server in Docker is used.

```text
[3/9]
Please enter the path to the excutor of the FunASR service on the localhost.
If not set, the default /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server in Docker is used.
Setting the path to the excutor of the FunASR service on the localhost:
Corresponding, the path of FunASR in Docker is /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server
```

##### Setting the port on the host machine for FunASR

Set the port on the host machine for Docker. The default port is 10095. Please ensure that this port is available.

```text
[4/9]
[2/5]
Please input the opened port in the host used for FunASR server.
Default: 10095
Setting the opened host port [1-65535]:
Setting the opened host port [1-65535], default(10095):
The port of the host is 10095
The port in Docker for FunASR server is 10095
```

### Set SSL

##### Setting the number of inference threads for the FunASR service

Set the number of inference threads for the FunASR service. The default value is the number of cores on the host machine. The number of I/O threads for the service is automatically set to one-quarter of the number of inference threads.

```text
[5/9]
Please input thread number for FunASR decoder.
Default: 1
Setting the number of decoder thread:

The number of decoder threads is 1
The number of IO threads is 1
```

##### Displaying all set parameters for confirmation

Displays the parameters set in the previous steps. Confirming will save all parameters to /var/funasr/config and start Docker; otherwise, you will be prompted to reset the parameters.

```text

[6/9]
Show parameters of FunASR server setting and confirm to run ...

The current Docker image is : registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest
The model is downloaded or stored to this directory in local : /root/models
The model will be automatically downloaded to the directory : /workspace/models
The ASR model_id used : damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
The ASR model directory corresponds to the directory in Docker : /workspace/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
The VAD model_id used : damo/speech_fsmn_vad_zh-cn-16k-common-onnx
The VAD model directory corresponds to the directory in Docker : /workspace/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx
The PUNC model_id used : damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
The PUNC model directory corresponds to the directory in Docker: /workspace/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx

The path in the docker of the FunASR service executor : /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server
Set the host port used for use by the FunASR service : 10095
Set the docker port used by the FunASR service : 10095
Set the number of threads used for decoding the FunASR service : 1
Set the number of threads used for IO the FunASR service : 1

Please input [Y/n] to confirm the parameters.
[y] Verify that these parameters are correct and that the service will run.
[n] The parameters set are incorrect, it will be rolled out, please rerun.
read confirmation[Y/n]:

Will run FunASR server later ...
Parameters are stored in the file /var/funasr/config
```
##### Checking the Docker service

Check whether the Docker service is installed on the host machine; if not, install and start Docker.

```text
[7/9]
Start install docker for ubuntu
Get docker installer: curl -fsSL https://test.docker.com -o test-docker.sh
Get docker run: sudo sh test-docker.sh
# Executing docker install script, commit: c2de0811708b6d9015ed1a2c80f02c9b70c8ce7b
+ sh -c apt-get update -qq >/dev/null
+ sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
+ sh -c install -m 0755 -d /etc/apt/keyrings
+ sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | gpg --dearmor --yes -o /etc/apt/keyrings/docker.gpg
+ sh -c chmod a+r /etc/apt/keyrings/docker.gpg
+ sh -c echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu focal test" > /etc/apt/sources.list.d/docker.list
+ sh -c apt-get update -qq >/dev/null
+ sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq docker-ce docker-ce-cli containerd.io docker-compose-plugin docker-ce-rootless-extras docker-buildx-plugin >/dev/null
+ sh -c docker version
Client: Docker Engine - Community
Version: 24.0.2

...
...

Docker install success, start docker server.
```

##### Downloading the FunASR Docker image

Download and update the FunASR Docker image selected in step 1.1.

```text
[8/9]
Pull docker image(registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest)...
funasr-runtime-cpu-0.0.1: Pulling from funasr_repo/funasr
7608715873ec: Pull complete
3e1014c56f38: Pull complete

...
...
```

##### Starting the FunASR Docker container

Start the FunASR Docker container, wait for the models selected in step 1.2 to finish downloading, and start the FunASR service.

```text
[9/9]
Construct command and run docker ...
943d8f02b4e5011b71953a0f6c1c1b9bc5aff63e5a96e7406c83e80943b23474

Loading models:
[ASR ][Done ][==================================================][100%][1.10MB/s][v1.2.1]
[VAD ][Done ][==================================================][100%][7.26MB/s][v1.2.0]
[PUNC][Done ][==================================================][100%][ 474kB/s][v1.1.7]
The service has been started.
If you want to see an example of how to use the client, you can run sudo bash funasr-runtime-deploy.sh -c .
```

#### Starting the deployed FunASR service

If the computer is restarted or Docker is closed after the one-click deployment, the following command can be used to start the FunASR service directly with the settings from the last one-click deployment.

SSL verification is enabled by default. If you need to disable it, you can set it at startup.
```shell
sudo bash funasr-runtime-deploy.sh start
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh --ssl 0
```
#### Shutting down the FunASR service

## Contact Us

```shell
sudo bash funasr-runtime-deploy.sh stop
```
If you encounter any problems during use, please join our user group for feedback.

| DingDing Group | Wechat |
|:--------------:|:------:|
| <img src="../../../docs/images/dingding.jpg" width="250"/> | <img src="../../../docs/images/wechat.png" width="232"/> |

#### Restarting the FunASR service

Restart the FunASR service with the settings from the last one-click deployment.

```shell
sudo bash funasr-runtime-deploy.sh restart
```
#### Replacing the model and restarting the FunASR service

Replace the currently used model and restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope.

```shell
sudo bash scripts/funasr-runtime-deploy.sh update model <model ID in ModelScope>

e.g.
sudo bash scripts/funasr-runtime-deploy.sh update model damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
```

### How to test and use the offline file transcription service

After completing the FunASR service deployment on the server, you can test and use the offline file transcription service by following these steps. Currently, command-line clients are supported for Python, C++, and Java, as well as an HTML web page version that can be used directly in the browser. For more client language support, please refer to the "FunASR Advanced Development Guide".
After the funasr-runtime-deploy.sh script finishes running, you can use the following command to automatically download the test samples to the funasr_samples directory in the current directory and run the program interactively with the configured parameters:

```shell
sudo bash funasr-runtime-deploy.sh client
```

You can choose from the provided Python and Linux C++ sample programs. Taking the Python sample as an example:

```text
Will download sample tools for the client to show how speech recognition works.
Please select the client you want to run.
1) Python
2) Linux_Cpp
Enter your choice: 1

Please enter the IP of server, default(127.0.0.1):
Please enter the port of server, default(10095):
Please enter the audio path, default(/root/funasr_samples/audio/asr_example.wav):

Run pip3 install click>=8.0.4
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Requirement already satisfied: click>=8.0.4 in /usr/local/lib/python3.8/dist-packages (8.1.3)

Run pip3 install -r /root/funasr_samples/python/requirements_client.txt
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Requirement already satisfied: websockets in /usr/local/lib/python3.8/dist-packages (from -r /root/funasr_samples/python/requirements_client.txt (line 1)) (11.0.3)

Run python3 /root/funasr_samples/python/funasr_wss_client.py --host 127.0.0.1 --port 10095 --mode offline --audio_in /root/funasr_samples/audio/asr_example.wav --send_without_sleep --output_dir ./funasr_samples/python

...
...

pid0_0: 欢迎大家来体验达摩院推出的语音识别模型。
Exception: sent 1000 (OK); then received 1000 (OK)
end

If failed, you can try (python3 /root/funasr_samples/python/funasr_wss_client.py --host 127.0.0.1 --port 10095 --mode offline --audio_in /root/funasr_samples/audio/asr_example.wav --send_without_sleep --output_dir ./funasr_samples/python) in your Shell.

```
#### python-client

If you want to run the client directly for testing, you can refer to the following simple example, taking the Python version as an example:

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav" --send_without_sleep --output_dir "./results"
```
Command parameter instructions:

```text
--host: The IP address of the machine where the FunASR runtime-SDK service is deployed. Defaults to the local address (127.0.0.1); if the client and the service are not on the same server, change it to the address of the deployment machine.
--port 10095: The deployment port number.
--mode offline: Indicates offline file transcription.
--audio_in: The audio to be transcribed, either a single file path or a file list (wav.scp).
--output_dir: The path where the recognition results are saved.
```
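For reference, `wav.scp` follows the common Kaldi-style two-column convention: one utterance ID and one audio path per line. A minimal sketch (the IDs and paths below are made up for illustration):

```text
sample_01 /root/funasr_samples/audio/asr_example.wav
sample_02 /data/audio/meeting_recording.wav
```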
#### cpp-client

```shell
export LD_LIBRARY_PATH=/root/funasr_samples/cpp/libs:$LD_LIBRARY_PATH
/root/funasr_samples/cpp/funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path /root/funasr_samples/audio/asr_example.wav
```
Command parameter instructions:

```text
--server-ip: The IP address of the machine where the FunASR runtime-SDK service is deployed. Defaults to the local address (127.0.0.1); if the client and the service are not on the same server, change it to the address of the deployment machine.
--port 10095: The deployment port number.
--wav-path: The path of the audio file to be transcribed.
```
### Video demo

[demo]()

## Contact Us

If you encounter any problems during use, please join our user group for feedback.

| DingDing Group | Wechat |
|:----------------------------------------------------------------------------:|:--------------------------------------------------------------:|
| <div align="left"><img src="../../../docs/images/dingding.jpg" width="250"/> | <img src="../../../docs/images/wechat.png" width="232"/></div> |
@ -1,3 +1,5 @@
(简体中文|[English](./SDK_tutorial.md))

# Convenient Deployment Tutorial for the FunASR Offline File Transcription Service

FunASR provides an offline file transcription service that can be conveniently deployed on local or cloud servers; its core is the runtime-SDK that FunASR has open-sourced.
@ -1,72 +1,93 @@
([简体中文](./readme_zh.md)|English)

# Speech Recognition Service Html5 Client Access Interface

The server is deployed over the WebSocket protocol, and the client supports HTML5 webpage access with microphone input or file input. There are two ways to access the service:

- Method 1:

  Connect with the HTML client directly: manually download the client ([click here](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/html5/static)) to the local computer, open the index.html webpage, and enter the wss address and port number.

- Method 2:

  Use the HTML5 server, which serves the client to the local computer automatically and supports access from mobile phones and other devices.
## Starting the Speech Recognition Service

Both a Python and a C++ version can be deployed:

- Python version

  Deploys the Python pipeline directly. It supports streaming real-time speech recognition models, offline speech recognition models, and the integrated streaming-offline model with error correction, and it outputs text with punctuation. A single server supports a single client.

- C++ version

  funasr-runtime-sdk (version 0.1.0), which supports one-click deployment and offline file transcription. A single server supports requests from hundreds of clients.
### Starting the Python Version Service

#### Install Dependencies

```shell
pip3 install -U modelscope funasr flask
# Users in mainland China who encounter network issues can install from a mirror:
# pip3 install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
# pip3 install gevent    (optional)
# pip3 install pyOpenSSL (optional)
git clone https://github.com/alibaba/FunASR.git && cd FunASR
```
### javascript (Optional)

The webpage recorder is based on [html5 recorder.js](https://github.com/xiangyuecn/Recorder).
## Demo

<div align="center"><img src="./demo.gif" width="150"/> </div>

#### Start the ASR Service
`Tips:` The ASR service and the HTML5 service should be deployed on the same device.

```shell
cd funasr/runtime/python/websocket
python funasr_wss_server.py --port 10095
```

For detailed parameter configuration and analysis, please click [here](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/websocket).
#### Html5 Service (Optional)

If you want to access the service through the client described above, start the HTML5 service:

```shell
python h5Server.py [-h] [--host HOST] [--port PORT] [--certfile CERTFILE] [--keyfile KEYFILE]
```

As in the example below, pay attention to the IP address: if you access the page from another device (such as a mobile phone), set it to the machine's real public IP address.

```shell
cd funasr/runtime/html5
python h5Server.py --host 0.0.0.0 --port 1337
```

After starting, open [https://127.0.0.1:1337/static/index.html](https://127.0.0.1:1337/static/index.html) in the browser to access it.
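Because the page is served over https, the browser expects a certificate. For purely local testing, one option is to generate a self-signed pair and pass it through the --certfile/--keyfile options shown above (a sketch; the file names are arbitrary, and browsers will warn about self-signed certificates):

```shell
# Generate a self-signed certificate valid for one year (local testing only).
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 365 -nodes -subj "/CN=127.0.0.1"
python h5Server.py --host 0.0.0.0 --port 1337 --certfile cert.pem --keyfile key.pem
```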
### Starting the C++ Version Service

Since the C++ version has many dependencies, deploying it with Docker is recommended; a one-click start of the service is supported.

```shell
curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy-offline-cpu-zh.sh;
sudo bash funasr-runtime-deploy-offline-cpu-zh.sh install --workspace /root/funasr-runtime-resources
```

For detailed parameter configuration and analysis, please click [here](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/docs/SDK_tutorial_zh.md).
## Client Testing

### Method 1

Connect with the HTML client directly: manually download the client ([click here](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/html5/static)) to the local computer, open the index.html webpage, and enter the wss address and port number to use it.

### Method 2

Use the HTML5 server, which serves the client to the local computer automatically and supports access from mobile phones and other devices. The IP address must match the HTML5 server's; on the local computer you can use 127.0.0.1.

### Open the browser to access the html5 demo

```shell
https://127.0.0.1:1337/static/index.html
# e.g. a public IP: https://30.220.136.139:1337/static/index.html
```
### Open the html5 file directly without h5Server

You can also run the HTML5 client by opening the index.html file directly on your computer:

1) Launch the ASR service without SSL; it must run in ws mode, since the wss protocol prohibits this kind of access.
2) Copy the whole /funasr/runtime/html5/static directory to your computer.
3) Open /funasr/runtime/html5/static/index.html in a browser.
4) Enter the ASR service ws address and connect.
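For step 1, a minimal sketch of starting the websocket ASR server in plain ws mode is shown below; it assumes funasr_wss_server.py accepts an empty --certfile to disable SSL, as described in the websocket documentation linked above.

```shell
cd funasr/runtime/python/websocket
# An empty certfile is assumed to switch the server from wss to plain ws.
python funasr_wss_server.py --port 10095 --certfile ""
```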
## Acknowledge

1. This project is maintained by the [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
2. We acknowledge [AiHealthx](http://www.aihealthx.com/) for contributing the html5 demo.