mirror of
https://github.com/modelscope/FunASR
synced 2025-09-15 14:48:36 +08:00
GPU Benchmark (libtorch-cpp)
Configuration
Data set:
A long-audio test set (not open source) containing 103 audio files, with durations ranging from 2 to 30 minutes.
FSMN-VAD + Paraformer-large + CT-Transformer
./funasr-onnx-offline-rtf \
--model-dir ./damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
--vad-dir ./damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--punc-dir ./damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--gpu \
--thread-num 20 \
--bladedisc true \
--batch-size 20 \
--wav-path ./long_test.scp
Note: run in Docker; refer to (docs)
Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz, 16 cores / 32 processors, with avx512_vnni; GPU: A10
| concurrent-tasks | batch | RTF | Speedup Rate |
|---|---|---|---|
| 1 | 1 | 0.0076 | 130 |
| 1 | 20 | 0.0048 | 208 |
| 5 | 20 | 0.0011 | 850 |
| 10 | 20 | 0.0008 | 1200+ |
| 20 | 20 | 0.0008 | 1200+ |
Note: On CPU, the single-thread RTF is 0.066, and the speedup with 32 threads is 330+.
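To make the two result columns easier to relate, here is a minimal sketch of how RTF and speedup can be computed. It assumes RTF is processing time divided by audio duration, and that the Speedup Rate column is the reciprocal of RTF, which matches the first two table rows after rounding. The helper names `rtf` and `speedup` are illustrative and not part of FunASR.

```python
def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup(rtf_value: float) -> float:
    """Speedup over real time, assumed here to be the reciprocal of RTF."""
    if rtf_value <= 0:
        raise ValueError("rtf_value must be positive")
    return 1.0 / rtf_value


# RTF values copied from the table above: (concurrent-tasks, batch) -> RTF
table_rtf = {(1, 1): 0.0076, (1, 20): 0.0048, (5, 20): 0.0011}

for (tasks, batch), value in table_rtf.items():
    print(f"tasks={tasks} batch={batch} rtf={value} speedup~{speedup(value):.0f}")
```

Because the published RTF values are rounded to four decimal places, the reciprocal will not reproduce every reported speedup exactly, especially in the rows with very small RTF.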