mirror of
https://github.com/modelscope/FunASR
synced 2025-09-15 14:48:36 +08:00
# GPU Benchmark (libtorch-cpp)

## Configuration

### Data set:

A long-audio test set (not open source) containing 103 audio files, with durations ranging from 2 to 30 minutes.

## [FSMN-VAD](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary) + [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript/summary) + [CT-Transformer](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx/summary)

```shell
./funasr-onnx-offline-rtf \
    --model-dir ./damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
    --vad-dir ./damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
    --punc-dir ./damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
    --gpu \
    --thread-num 20 \
    --bladedisc true \
    --batch-size 20 \
    --wav-path ./long_test.scp
```

Note: Run inside Docker; refer to the [docs](./SDK_advanced_guide_offline_gpu_zh.md).

### Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz 16core-32processor with avx512_vnni, GPU @ A10

| concurrent-tasks | batch  |  RTF   | Speedup Rate |
|------------------|:------:|:------:|:------------:|
| 1                |   1    | 0.0076 |     130      |
| 1                |   20   | 0.0048 |     208      |
| 5                |   20   | 0.0011 |     850      |
| 10               |   20   | 0.0008 |    1200+     |
| 20               |   20   | 0.0008 |    1200+     |

Note: On CPU, the single-thread RTF is 0.066, and the speedup with 32 threads is 330+.
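
The Speedup Rate column appears to be the reciprocal of RTF (for example, 1 / 0.0048 ≈ 208). The following is a minimal sketch under that assumption, with RTF taken as processing time divided by audio duration. The helper names `rtf` and `speedup` are illustrative only and are not part of FunASR.

```shell
#!/bin/sh
# Assumption: RTF = processing time / audio duration (both in seconds).
rtf() {
    awk -v proc="$1" -v dur="$2" 'BEGIN { printf "%.4f\n", proc / dur }'
}

# Assumption: speedup rate = 1 / RTF, rounded to the nearest integer.
speedup() {
    awk -v rtf="$1" 'BEGIN { printf "%.0f\n", 1 / rtf }'
}

# Recompute the speedup for each RTF value reported above.
for value in 0.0076 0.0048 0.0011 0.0008 0.066; do
    printf 'RTF %s -> speedup %s\n' "$value" "$(speedup "$value")"
done
```

Because the published RTF values are rounded to four decimal places, recomputed speedups can differ from the table: 0.0011 gives about 909, versus the reported 850, which was presumably derived from the unrounded measurement.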