
Using paraformer with grpc

With the grpc client we can stream audio data to the server in real time (for example, one chunk every 10 ms) and receive the transcribed text once the speaker stops. The audio is transmitted in streaming mode, while the ASR inference itself runs in offline mode on the accumulated audio.
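
As a rough sketch of this interaction, a client opens a grpc channel, streams audio chunks through the generated stub, and reads back the recognized text. The stub, method, and message field names below (ASRStub, Recognize, audio_data, text) are placeholders for illustration only; the actual names are defined in ./proto/paraformer.proto and used in grpc_client.py.

# Illustrative client sketch; stub/method/field names are hypothetical placeholders.
import grpc
import paraformer_pb2
import paraformer_pb2_grpc

def audio_chunks(wav_bytes, chunk_size=320):  # ~10 ms of 16 kHz 16-bit mono audio
    for i in range(0, len(wav_bytes), chunk_size):
        yield paraformer_pb2.Request(audio_data=wav_bytes[i:i + chunk_size])  # hypothetical message

with grpc.insecure_channel("127.0.0.1:10095") as channel:
    stub = paraformer_pb2_grpc.ASRStub(channel)  # hypothetical stub name
    with open("test.wav", "rb") as f:
        for response in stub.Recognize(audio_chunks(f.read())):  # bidirectional streaming call
            print(response.text)  # hypothetical response field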

Steps

Step 1-1) Prepare the modelscope pipeline environment (on the server).

         Install modelscope and funasr, either with pip or via the cuda-docker image.

         Option 1: Install modelscope and funasr with pip.

         Option 2: Install via the cuda-docker image:

CID=`docker run --network host -d -it --gpus '"device=0"' registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.3.0-py37-torch1.11.0-tf1.15.5-1.2.0`
echo $CID
docker exec -it $CID /bin/bash
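
Either way, a quick import check from Python helps confirm the environment before moving on (a minimal sketch; run it inside the container when using the docker option):

# Sanity check: both packages should import without errors.
import funasr
import modelscope
print("funasr and modelscope imported successfully")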

         Get the funasr source code and change into the grpc directory.

git clone https://github.com/alibaba-damo-academy/FunASR
cd FunASR/funasr/runtime/python/grpc/

Step 1-2) Optional: prepare the onnxruntime environment (on the server).

Install rapid_paraformer.

  • Build the rapid_paraformer whl
git clone https://github.com/alibaba-damo-academy/FunASR.git && cd FunASR
cd funasr/runtime/python/onnxruntime/rapid_paraformer
python setup.py bdist_wheel
  • Install the built whl
pip install dist/rapid_paraformer-0.0.1-py3-none-any.whl

Export the model; for more details, refer to the export docs.

python -m funasr.export.export_model 'damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch' "./export" true
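
To confirm that the export produced a loadable model, the exported onnx file can be opened directly with onnxruntime. This is only a quick sanity check; the exact file layout under ./export depends on the export script, so the glob pattern below is an assumption.

# Load the first exported .onnx file found under ./export (path layout assumed).
import glob
import onnxruntime as ort

onnx_path = glob.glob("./export/**/*.onnx", recursive=True)[0]
session = ort.InferenceSession(onnx_path)
print(onnx_path)
print([inp.name for inp in session.get_inputs()])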

Step 2) Optional: generate the protobuf files (run on the server; the two generated pb files are used by both the server and the client).

# Optional, Install dependency.
python -m pip install grpcio grpcio-tools
# paraformer_pb2.py and paraformer_pb2_grpc.py are already generated, 
# regenerate it only when you make changes to ./proto/paraformer.proto file.
python -m grpc_tools.protoc --proto_path=./proto --python_out=. --grpc_python_out=. ./proto/paraformer.proto
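
After regeneration, a quick check that the generated modules import cleanly and expose a client stub (run from this directory):

# grpc-generated client stub classes end in "Stub".
import paraformer_pb2
import paraformer_pb2_grpc
print([name for name in dir(paraformer_pb2_grpc) if name.endswith("Stub")])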

Step 3) Start grpc server (on server).

# Optional, Install dependency.
python -m pip install grpcio grpcio-tools
# Start server.
python grpc_main_server.py --port 10095 --backend pipeline

If you want to run the server with onnxruntime, set the backend and onnx_dir parameters.

# Start server.
python grpc_main_server.py --port 10095 --backend onnxruntime --onnx_dir /models/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
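
Internally, grpc_main_server.py follows the standard grpc-python server pattern. The simplified skeleton below shows that pattern only; ASRServicer and add_ASRServicer_to_server are hypothetical names, and the real request handling lives in grpc_server.py.

# Standard grpc-python server skeleton; servicer names are hypothetical placeholders.
from concurrent import futures
import grpc
import paraformer_pb2_grpc

class ASRServicer(paraformer_pb2_grpc.ASRServicer):  # hypothetical generated base class
    pass  # the real streaming recognition logic is implemented in grpc_server.py

def serve(port=10095):
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    paraformer_pb2_grpc.add_ASRServicer_to_server(ASRServicer(), server)  # hypothetical
    server.add_insecure_port("[::]:%d" % port)
    server.start()
    server.wait_for_termination()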

Step 4) Start grpc client (on client with microphone).

# Optional, Install dependency.
python -m pip install pyaudio webrtcvad grpcio grpcio-tools
# Start client.
python grpc_main_client_mic.py --host 127.0.0.1 --port 10095
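
For reference, the microphone side of such a client typically combines pyaudio for 16 kHz capture with webrtcvad to decide when speech has stopped. The generic capture loop below illustrates that combination; it is not a copy of grpc_main_client_mic.py, and sending frames over grpc is only indicated in the comments.

# Generic 16 kHz mono capture loop with VAD; 30 ms frames = 480 samples.
import pyaudio
import webrtcvad

RATE, FRAME_MS = 16000, 30
FRAME_SAMPLES = RATE * FRAME_MS // 1000

vad = webrtcvad.Vad(2)  # aggressiveness 0 (least) to 3 (most)
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                 input=True, frames_per_buffer=FRAME_SAMPLES)

try:
    while True:
        frame = stream.read(FRAME_SAMPLES)
        if vad.is_speech(frame, RATE):
            pass  # in the real client, speech frames are streamed to the grpc server here
        else:
            pass  # after enough silent frames, the client marks the end of speech
except KeyboardInterrupt:
    stream.stop_stream()
    stream.close()
    pa.terminate()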

Workflow design

[workflow diagram image]

Reference

We borrowed from or referred to code from the following projects:

1) https://github.com/wenet-e2e/wenet/tree/main/runtime/core/grpc

2) https://github.com/Open-Speech-EkStep/inference_service/blob/main/realtime_inference_service.py