
ONNXRuntime-python

Install funasr-onnx

install from pip

pip install -U funasr-onnx
# For users in China, you can install from a mirror:
# pip install -U funasr-onnx -i https://mirror.sjtu.edu.cn/pypi/web/simple
# To export .onnx files yourself, also install modelscope and funasr
pip install -U modelscope funasr
# For users in China, you can install from a mirror:
# pip install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple

or install from source code

git clone https://github.com/alibaba/FunASR.git && cd FunASR
cd runtime/python/onnxruntime
pip install -e ./
# For users in China, you can install from a mirror:
# pip install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
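
As noted in the comments above, exporting .onnx files yourself requires modelscope and funasr. A minimal export sketch, assuming the AutoModel.export API from funasr 1.x (verify the exact signature against your installed version):

from funasr import AutoModel

# assumption: AutoModel.export writes model.onnx (and model_quant.onnx when
# quantize=True) into the downloaded model directory
model = AutoModel(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
res = model.export(type="onnx", quantize=False)
print(res)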

Inference with runtime

Speech Recognition

Paraformer

from funasr_onnx import Paraformer
from pathlib import Path

model_dir = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
model = Paraformer(model_dir, batch_size=1, quantize=True)

wav_path = ['{}/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav'.format(Path.home())]

result = model(wav_path)
print(result)
  • model_dir: the model name on ModelScope, or a local path downloaded from ModelScope. A local path should contain model.onnx, config.yaml, and am.mvn
  • batch_size: 1 (Default), the batch size used during inference
  • device_id: -1 (Default), infer on CPU. To infer on GPU, set it to the gpu_id (make sure onnxruntime-gpu is installed)
  • quantize: False (Default), load model.onnx from model_dir. If set to True, load model_quant.onnx instead
  • intra_op_num_threads: 4 (Default), the number of threads used for intra-op parallelism on CPU

Input: wav file(s); supported input types: str, np.ndarray, List[str]

Output: List[str]: recognition result
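
Putting these options together, a hypothetical setup for quantized inference on GPU 0 (the gpu_id and the presence of model_quant.onnx are assumptions):

from funasr_onnx import Paraformer

model_dir = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
# device_id=0 assumes onnxruntime-gpu is installed and GPU 0 is available;
# quantize=True assumes model_quant.onnx exists in model_dir
model = Paraformer(model_dir, batch_size=1, device_id=0, quantize=True, intra_op_num_threads=4)

result = model(["/path/to/asr_example.wav"])  # illustrative path
print(result)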

Paraformer-online

Voice Activity Detection

FSMN-VAD

from funasr_onnx import Fsmn_vad
from pathlib import Path

model_dir = "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch"
wav_path = '{}/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/example/vad_example.wav'.format(Path.home())

model = Fsmn_vad(model_dir)

result = model(wav_path)
print(result)
  • model_dir: the model name on ModelScope, or a local path downloaded from ModelScope. A local path should contain model.onnx, config.yaml, and am.mvn
  • batch_size: 1 (Default), the batch size used during inference
  • device_id: -1 (Default), infer on CPU. To infer on GPU, set it to the gpu_id (make sure onnxruntime-gpu is installed)
  • quantize: False (Default), load model.onnx from model_dir. If set to True, load model_quant.onnx instead
  • intra_op_num_threads: 4 (Default), the number of threads used for intra-op parallelism on CPU

Input: wav file(s); supported input types: str, np.ndarray, List[str]

Output: List: detected speech segments (start/end timestamps)
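
A short sketch that converts the detected segments to seconds; the [start_ms, end_ms] layout is an assumption based on the demo output, so verify it on your setup:

# assumption: result[0] holds [start_ms, end_ms] pairs for the first input
segments = model(wav_path)[0]
for start_ms, end_ms in segments:
    print("speech from {:.2f}s to {:.2f}s".format(start_ms / 1000, end_ms / 1000))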

FSMN-VAD-online

from funasr_onnx import Fsmn_vad_online
import soundfile
from pathlib import Path

model_dir = "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch"
wav_path = '{}/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/example/vad_example.wav'.format(Path.home())

model = Fsmn_vad_online(model_dir)


# online VAD: feed the audio in fixed-size chunks
speech, sample_rate = soundfile.read(wav_path)
speech_length = speech.shape[0]

sample_offset = 0
step = 1600  # chunk size in samples (100 ms at 16 kHz)
param_dict = {'in_cache': []}  # carries the model state between chunks
for sample_offset in range(0, speech_length, min(step, speech_length - sample_offset)):
    if sample_offset + step >= speech_length - 1:
        step = speech_length - sample_offset
        is_final = True  # last chunk: flush the remaining state
    else:
        is_final = False
    param_dict['is_final'] = is_final
    segments_result = model(audio_in=speech[sample_offset: sample_offset + step],
                            param_dict=param_dict)
    if segments_result:
        print(segments_result)
  • model_dir: the model name on ModelScope, or a local path downloaded from ModelScope. A local path should contain model.onnx, config.yaml, and am.mvn
  • batch_size: 1 (Default), the batch size used during inference
  • device_id: -1 (Default), infer on CPU. To infer on GPU, set it to the gpu_id (make sure onnxruntime-gpu is installed)
  • quantize: False (Default), load model.onnx from model_dir. If set to True, load model_quant.onnx instead
  • intra_op_num_threads: 4 (Default), the number of threads used for intra-op parallelism on CPU

Input: wav file(s); supported input types: str, np.ndarray, List[str]

Output: List: detected speech segments (start/end timestamps)
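
The chunked loop above generalizes to any audio source (file, microphone, socket). A sketch that wraps it into a small helper, reusing the audio_in/param_dict calling convention from the demo:

from funasr_onnx import Fsmn_vad_online
import soundfile

def stream_vad(model, speech, step=1600):
    # feed fixed-size chunks; param_dict carries the model state between calls
    param_dict = {'in_cache': []}
    for offset in range(0, len(speech), step):
        param_dict['is_final'] = offset + step >= len(speech)
        result = model(audio_in=speech[offset: offset + step], param_dict=param_dict)
        if result:
            yield result

model = Fsmn_vad_online("damo/speech_fsmn_vad_zh-cn-16k-common-pytorch")
speech, _ = soundfile.read("vad_example.wav")  # illustrative path
for segments in stream_vad(model, speech):
    print(segments)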

Punctuation Restoration

CT-Transformer

from funasr_onnx import CT_Transformer

model_dir = "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch"
model = CT_Transformer(model_dir)

text_in="跨境河流是养育沿岸人民的生命之源长期以来为帮助下游地区防灾减灾中方技术人员在上游地区极为恶劣的自然条件下克服巨大困难甚至冒着生命危险向印方提供汛期水文资料处理紧急事件中方重视印方在跨境河流问题上的关切愿意进一步完善双方联合工作机制凡是中方能做的我们都会去做而且会做得更好我请印度朋友们放心中国在上游的任何开发利用都会经过科学规划和论证兼顾上下游的利益"
result = model(text_in)
print(result[0])
  • model_dir: the model name on ModelScope, or a local path downloaded from ModelScope. A local path should contain model.onnx, config.yaml, and am.mvn
  • device_id: -1 (Default), infer on CPU. To infer on GPU, set it to the gpu_id (make sure onnxruntime-gpu is installed)
  • quantize: False (Default), load model.onnx from model_dir. If set to True, load model_quant.onnx instead
  • intra_op_num_threads: 4 (Default), the number of threads used for intra-op parallelism on CPU

Input: str, the raw text of an ASR result

Output: recognition result; result[0] is the punctuated text
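
Because the punctuation model consumes raw ASR text, it composes directly with Paraformer. A sketch of the two-stage pipeline, assuming the List[str] ASR output documented above (the wav path is illustrative):

from funasr_onnx import Paraformer, CT_Transformer

asr_model = Paraformer("damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
punc_model = CT_Transformer("damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch")

text = asr_model(["/path/to/asr_example.wav"])[0]  # raw, unpunctuated text
result = punc_model(text)
print(result[0])  # punctuated text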

CT-Transformer-online

from funasr_onnx import CT_Transformer_VadRealtime

model_dir = "damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727"
model = CT_Transformer_VadRealtime(model_dir)

# "|" marks VAD segment boundaries; each segment is fed to the model in turn
text_in = "跨境河流是养育沿岸|人民的生命之源长期以来为帮助下游地区防灾减灾中方技术人员|在上游地区极为恶劣的自然条件下克服巨大困难甚至冒着生命危险|向印方提供汛期水文资料处理紧急事件中方重视印方在跨境河流问题上的关切|愿意进一步完善双方联合工作机制|凡是|中方能做的我们|都会去做而且会做得更好我请印度朋友们放心中国在上游的|任何开发利用都会经过科学|规划和论证兼顾上下游的利益"

vads = text_in.split("|")
rec_result_all=""
param_dict = {"cache": []}  # carries punctuation context across segments
for vad in vads:
    result = model(vad, param_dict=param_dict)
    rec_result_all += result[0]

print(rec_result_all)
  • model_dir: the model name on ModelScope, or a local path downloaded from ModelScope. A local path should contain model.onnx, config.yaml, and am.mvn
  • device_id: -1 (Default), infer on CPU. To infer on GPU, set it to the gpu_id (make sure onnxruntime-gpu is installed)
  • quantize: False (Default), load model.onnx from model_dir. If set to True, load model_quant.onnx instead
  • intra_op_num_threads: 4 (Default), the number of threads used for intra-op parallelism on CPU

Input: str, the raw text of an ASR result

Output: recognition result; result[0] is the punctuated text

Performance benchmark

Please refer to the benchmark.

Acknowledgements

  1. This project is maintained by the FunASR community.
  2. We partially refer to SWHL's onnxruntime implementation (only for the Paraformer model).