FunASR/funasr/runtime/python/libtorch/README.md
2023-03-17 20:05:56 +08:00

## Using paraformer with libtorch
### Introduction
- The model comes from [speech_paraformer](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary).
### Steps:
1. Export the model.
- Command: (`Tips`: torch >= 1.11.0 is required.)
```shell
python -m funasr.export.export_model --model-name [model_name] --export-dir [export_dir] --type torch --quantize False
```
- `model_name`: the model to export.
- `export_dir`: the directory where the exported model is written.
- For more details, refer to the [export docs](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export).
- `e.g.`, export a model from ModelScope:
```shell
python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type torch --quantize False
```
- `e.g.`, export a model from a local path; the model file must be named `model.pb`:
```shell
python -m funasr.export.export_model --model-name ./damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type torch --quantize False
```
2. Install `torch_paraformer`.
```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR
cd funasr/runtime/python/libtorch
python setup.py install
```
3. Run the demo.
- Model_dir: the model directory, which must contain `model.torchscripts`, `config.yaml`, and `am.mvn`.
- Input: audio in WAV format; supported input types: `str, np.ndarray, List[str]`
- Output: `List[str]`: recognition result.
- Example:
```python
from torch_paraformer import Paraformer

# Directory holding the exported model (model.torchscripts, config.yaml, am.mvn)
model_dir = "/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
model = Paraformer(model_dir, batch_size=1)

# The input here is a list of WAV file paths
wav_path = ['/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav']
result = model(wav_path)  # recognition result
print(result)
```
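Because the demo expects `model.torchscripts`, `config.yaml`, and `am.mvn` in the model directory, it can help to confirm they are present before constructing the model. The following is a minimal sketch under that assumption; `missing_model_files` is a hypothetical helper, not part of `torch_paraformer`.

```python
from pathlib import Path

# File names taken from the Model_dir description in step 3.
REQUIRED_FILES = ("model.torchscripts", "config.yaml", "am.mvn")

def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means the directory looks complete; otherwise the returned names indicate which export outputs to look for.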
## Speed
Environment: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz

Test: [wav, 5.53s, 100 times avg.](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav)

| Backend | RTF (FP32) |
|:--------:|:----------:|
| PyTorch | 0.110 |
| Libtorch | 0.048 |
| ONNX | 0.038 |
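
The README does not say how the RTF values were measured. The sketch below assumes the usual definition of real-time factor (average processing time divided by audio duration), averaged over repeated runs as in the "100 times avg." note. `real_time_factor` and `speedup` are hypothetical helpers, and `recognize` stands for any zero-argument callable that runs one inference.

```python
import time

def real_time_factor(recognize, audio_seconds, runs=100):
    """Average wall-clock time per call divided by the audio duration."""
    start = time.perf_counter()
    for _ in range(runs):
        recognize()
    elapsed = time.perf_counter() - start
    return (elapsed / runs) / audio_seconds

def speedup(baseline_rtf, candidate_rtf):
    """How many times faster the candidate backend is than the baseline."""
    return baseline_rtf / candidate_rtf

# RTF values taken from the table above.
print(round(speedup(0.110, 0.048), 2))  # PyTorch vs Libtorch -> 2.29
print(round(speedup(0.110, 0.038), 2))  # PyTorch vs ONNX -> 2.89
```

For the demo model, `recognize` could be something like `lambda: model(wav_path)` with `audio_seconds=5.53`; a lower RTF means faster recognition.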
## Acknowledge