
Service with grpc-cpp

For the Server

1. Build onnxruntime as described in its documentation
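
Alternatively, if you would rather not build from source, an official prebuilt Linux package can be used; a minimal sketch (the version number here is an assumption, pick one that matches your environment):

# Assumption: using a prebuilt onnxruntime release rather than building it yourself
wget https://github.com/microsoft/onnxruntime/releases/download/v1.14.0/onnxruntime-linux-x64-1.14.0.tgz
tar -xzf onnxruntime-linux-x64-1.14.0.tgz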

2. Compile and install grpc v1.52.0

# add grpc environment variables
echo "export GRPC_INSTALL_DIR=/path/to/grpc" >> ~/.bashrc
echo "export PKG_CONFIG_PATH=\$GRPC_INSTALL_DIR/lib/pkgconfig" >> ~/.bashrc
echo "export PATH=\$GRPC_INSTALL_DIR/bin/:\$PKG_CONFIG_PATH:\$PATH" >> ~/.bashrc
source ~/.bashrc

# install grpc
git clone --recurse-submodules -b v1.52.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc

cd grpc
mkdir -p cmake/build
pushd cmake/build
cmake -DgRPC_INSTALL=ON \
      -DgRPC_BUILD_TESTS=OFF \
      -DCMAKE_INSTALL_PREFIX=$GRPC_INSTALL_DIR \
      ../..
make
make install
popd
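
To verify the installation, check that the grpc tools and pkg-config metadata are visible (a quick sanity check under the environment variables set above):

which protoc grpc_cpp_plugin
pkg-config --modversion grpc++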

3. Compile and start grpc onnx paraformer server

You should have obtained the required dependencies (ffmpeg, onnxruntime and grpc) in the previous steps.

If not, run download_ffmpeg and download_onnxruntime first.

cd /path/to/FunASR/runtime/grpc
./build.sh
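
If the build succeeds, the server binary is placed under build/bin, which is the path run_server.sh expects:

ls build/bin/paraformer-server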

4. Download the paraformer models

Export the models according to the export_model documentation,

or run the commands below to export the default models:

pip install torch-quant onnx==1.14.0 onnxruntime==1.14.0

# online model
python ../../export/export_model.py --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online --export-dir models --type onnx --quantize true --model_revision v1.0.6
# offline model
python ../../export/export_model.py --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir models --type onnx --quantize true --model_revision v1.2.1
# vad model
python ../../export/export_model.py --model-name damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --export-dir models --type onnx --quantize true --model_revision v1.2.0
# punc model
python ../../export/export_model.py --model-name damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727 --export-dir models --type onnx --quantize true --model_revision v1.0.2
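
After export, each model ends up in its own subdirectory of models; a rough sketch of the layout (the exact file list depends on the export script, but model.onnx and model_quant.onnx are the files the --quantize options below refer to):

models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/
    model.onnx
    model_quant.onnx
    ...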

5. Start grpc paraformer server

# run as default
./run_server.sh

# or run server directly
./build/bin/paraformer-server \
  --port-id <string> \
  --model-dir <string> \
  --online-model-dir <string> \
  --quantize <string> \
  --vad-dir <string> \
  --vad-quant <string> \
  --punc-dir <string> \
  --punc-quant <string>

Where:
  --port-id <string> (required) the port the server listens on

  --model-dir <string> (required) path to the offline ASR model
  --online-model-dir <string> (required) path to the online ASR model
  --quantize <string> (optional) false (default) loads model.onnx from model-dir; true loads model_quant.onnx

  --vad-dir <string> (required) path to the VAD model
  --vad-quant <string> (optional) false (default) loads model.onnx from vad-dir; true loads model_quant.onnx

  --punc-dir <string> (required) path to the punctuation model
  --punc-quant <string> (optional) false (default) loads model.onnx from punc-dir; true loads model_quant.onnx
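
For example, with the default models exported in step 4 (the port number and the quantize settings are illustrative; adjust the paths to your layout):

./build/bin/paraformer-server \
  --port-id 10095 \
  --model-dir models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
  --online-model-dir models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online \
  --quantize true \
  --vad-dir models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch \
  --vad-quant true \
  --punc-dir models/damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727 \
  --punc-quant true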

For the client

Currently we only provide a Python grpc client.

Install the requirements as described in grpc-python.
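
A sketch of installing the dependencies and running the client; the script name and flags below are assumptions for illustration, so check the grpc-python directory for the actual entry point:

pip install grpcio grpcio-tools
# hypothetical client invocation; replace grpc_main_client.py with the real script name
python grpc_main_client.py --host 127.0.0.1 --port 10095 --wav_path test.wav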

Acknowledgements

  1. This project is maintained by the FunASR community.
  2. We acknowledge burkliu (刘柏基, liubaiji@xverse.cn) for contributing the grpc service.