Quick Start
You can use FunASR in the following ways:
- Service Deployment SDK
- Industrial model egs
- Academic model egs
Service Deployment SDK
Python version Example
Supports real-time streaming speech recognition, uses a non-streaming model to correct the streaming results, and outputs text with punctuation. Currently, only a single client is supported; for multi-concurrency, please refer to the C++ version service deployment SDK below.
Server Deployment
cd runtime/python/websocket
python funasr_wss_server.py --port 10095
Client Testing
python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass --chunk_size "5,10,5"
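As a rough illustration of what the client does, the sketch below streams a local 16 kHz mono wav to the server with the third-party websockets package, assuming the server is reachable over plain ws:// (use wss:// with an SSL context if SSL is enabled). The JSON handshake fields are assumptions modeled on funasr_wss_client.py, not a documented protocol; consult that script for the exact message format.

import asyncio, json, wave
import websockets  # pip install websockets

async def stream(path, uri="ws://127.0.0.1:10095"):
    async with websockets.connect(uri) as ws:
        # Handshake: declare the recognition mode and chunking (field names are assumptions).
        await ws.send(json.dumps({"mode": "2pass", "chunk_size": [5, 10, 5],
                                  "wav_name": path, "is_speaking": True}))
        with wave.open(path, "rb") as wav:
            pcm = wav.readframes(wav.getnframes())
        step = 3200  # roughly 100 ms of 16 kHz 16-bit mono audio
        for i in range(0, len(pcm), step):
            await ws.send(pcm[i:i + step])  # raw PCM chunk (the real client paces chunks to simulate real time)
        await ws.send(json.dumps({"is_speaking": False}))  # signal end of audio
        print(await ws.recv())  # first result message from the server (there may be more)

asyncio.run(stream("asr_example.wav"))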
For more examples, please refer to the docs.
Service Deployment Software
Supports both high-precision, high-efficiency, high-concurrency file transcription and low-latency real-time speech recognition. Both can be deployed with Docker and handle multiple concurrent requests.
Docker Installation (optional)
If you have already installed Docker, skip this step.
curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh;
sudo bash install_docker.sh
Real-time Speech Recognition Service Deployment
Docker Image Download and Launch
Use the following commands to pull and launch the FunASR software package Docker image (get the latest image version):
sudo docker pull \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.5
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10096:10095 -it --privileged=true \
-v $PWD/funasr-runtime-resources/models:/workspace/models \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.5
Server Start
After the Docker container is started, start the funasr-wss-server-2pass service program inside it:
cd FunASR/runtime
nohup bash run_server_2pass.sh \
--download-model-dir /workspace/models \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
--punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
--itn-dir thuduj12/fst_itn_zh \
--hotword /workspace/models/hotwords.txt > log.out 2>&1 &
# If you want to disable SSL, add the parameter: --certfile 0
# If you want to deploy with a timestamp or nn hotword model, please set --model-dir to the corresponding model:
# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx (timestamp)
# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx (nn hotword)
# If you want to load hotwords on the server side, configure them in the host file ./funasr-runtime-resources/models/hotwords.txt (mapped inside the Docker container to /workspace/models/hotwords.txt):
# One hotword per line, in the format "hotword weight", e.g.: Alibaba 20
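For example, the host-side hotwords.txt might contain the following (the entries and weights below are only illustrative):

阿里巴巴 20
语音识别 30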
Client Testing
Run the test sample:
python3 funasr_wss_client.py --host "127.0.0.1" --port 10096 --mode 2pass
For more examples, please refer to the docs.
File Transcription Service, Mandarin (CPU)
Docker Image Download and Launch
Use the following commands to pull and launch the FunASR software package Docker image (get the latest image version):
sudo docker pull \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.3.0
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10095:10095 -it --privileged=true \
-v $PWD/funasr-runtime-resources/models:/workspace/models \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.3.0
Server Start
After the Docker container is started, start the funasr-wss-server service program inside it:
cd FunASR/runtime
nohup bash run_server.sh \
--download-model-dir /workspace/models \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
--itn-dir thuduj12/fst_itn_zh \
--hotword /workspace/models/hotwords.txt > log.out 2>&1 &
# If you want to disable SSL, add the parameter: --certfile 0
# If you want to use timestamp or nn hotword models for deployment, please set --model-dir to the corresponding model:
# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx (timestamp)
# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx (nn hotword)
# If you want to load hotwords on the server side, configure them in the host file ./funasr-runtime-resources/models/hotwords.txt (mapped inside the Docker container to /workspace/models/hotwords.txt):
# One hotword per line, in the format "hotword weight", e.g.: Alibaba 20
Client Testing
Run the test sample:
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
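If you need to transcribe several recordings, a small wrapper like the sketch below (a hypothetical helper that only reuses the client flags shown above) invokes the offline client once per wav file:

import glob
import subprocess

# Transcribe every wav file in the sample audio folder, one client call per file.
for wav in sorted(glob.glob("../audio/*.wav")):  # adjust the folder to your data
    subprocess.run(["python3", "funasr_wss_client.py",
                    "--host", "127.0.0.1", "--port", "10095",
                    "--mode", "offline", "--audio_in", wav],
                   check=True)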
For more examples, please refer to the docs.
Industrial Model Egs
If you want to use the pre-trained industrial models from ModelScope for inference or fine-tuning, you can refer to the following example:
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
inference_pipeline = pipeline(
task=Tasks.auto_speech_recognition,
model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)
rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
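The same pipeline call should also accept a local recording instead of a URL (the path below is a placeholder; a 16 kHz mono wav matches the model used above):

rec_result = inference_pipeline(audio_in='/path/to/your_audio_16k.wav')  # local file path (placeholder)
print(rec_result['text'])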
More examples can be found in the docs.
Academic model egs
If you want to train from scratch, as is typical for academic models, you can start training and inference with the following commands:
cd egs/aishell/paraformer
. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
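CUDA_VISIBLE_DEVICES selects which GPUs are used, and gpu_num should match the number of selected devices; for example, a single-GPU run would look like:

. ./run.sh --CUDA_VISIBLE_DEVICES="0" --gpu_num=1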
More examples can be found in the docs.