mirror of https://github.com/espressif/esp-sr.git (synced 2025-09-15 15:28:44 +08:00)
doc: Update mn6 docs
parent 1e85c1a480
commit 9a360671a8
@@ -151,7 +151,7 @@ Resource Occupancy
 | MultiNet 5  | 16 KB       | 2310 KB     | 12 ms       | 32 ms       |
 | Q8          |             |             |             |             |
 +-------------+-------------+-------------+-------------+-------------+
-| MultiNet 6  | 52 KB       | 4400 KB     | 12 ms       | 32 ms       |
+| MultiNet 6  | 48 KB       | 4000 KB     | 12 ms       | 32 ms       |
 +-------------+-------------+-------------+-------------+-------------+
 
 Performance Test
@@ -23,17 +23,7 @@ MultiNet is a lightweight model designed to recognize multiple speech command wo
 
 The MultiNet input is the audio processed by the audio-front-end algorithm (AFE), with the format of 16 KHz, 16 bit and mono. By recognizing the audio signals, speech commands can be recognized.
 
-The following table shows the models supported by Espressif SoCs:
-
-+---------+-----------+-------------+---------------+-------------+
-| Chip    | ESP32     | ESP32S3                                   |
-+=========+===========+=============+===============+=============+
-| Model   | MultiNet2 | MultiNet4.5 | MultiNet4.5Q8 | MultiNet5Q8 |
-+---------+-----------+-------------+---------------+-------------+
-| Chinese | √         | √           | √             | √           |
-+---------+-----------+-------------+---------------+-------------+
-| English |           |             |               | √           |
-+---------+-----------+-------------+---------------+-------------+
+Please refer to :doc:`Models Benchmark <../benchmark/README>` to check models supported by Espressif SoCs.
 
 For details on flash models, see Section :doc:`Flashing Models <../flash_model/README>` .
 
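Review note on the input format stated in this hunk (16 KHz, 16 bit, mono): the following is a minimal host-side sketch, assuming test recordings are stored as WAV files, that checks a file against those three parameters with Python's standard `wave` module. The function name `matches_multinet_input` and the constants are hypothetical and not part of esp-sr; on the device the audio reaches MultiNet from the AFE, not from a file.

```python
import wave

# Values taken from the documented input format; the names are illustrative only.
EXPECTED_RATE_HZ = 16000        # 16 kHz sample rate
EXPECTED_SAMPLE_WIDTH_BYTES = 2 # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1           # mono


def matches_multinet_input(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

A check like this could run over a folder of recordings before they are used for offline evaluation, so that format mismatches show up early rather than as poor recognition results.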
@@ -151,7 +151,7 @@ MultiNet
 | MultiNet 5  | 16 KB       | 2310 KB     | 12 ms       | 32 ms       |
 | Q8          |             |             |             |             |
 +-------------+-------------+-------------+-------------+-------------+
-| MultiNet 6  | 52 KB       | 4400 KB     | 12 ms       | 32 ms       |
+| MultiNet 6  | 48 KB       | 4000 KB     | 12 ms       | 32 ms       |
 +-------------+-------------+-------------+-------------+-------------+
 
 Performance Test
@@ -23,17 +23,7 @@ MultiNet is designed for offline multi-command-word recognition on the {IDF_TARGET_NAME} series
 
 The MultiNet input is audio processed by the audio front-end algorithm (AFE), in 16 KHz, 16 bit, mono format. Recognizing the audio maps it to the corresponding Chinese characters or words.
 
-The following table shows the models supported on different chips:
-
-+---------+-----------+-------------+---------------+-------------+
-| Chip    | ESP32     | ESP32S3                                   |
-+=========+===========+=============+===============+=============+
-| Model   | MultiNet2 | MultiNet4.5 | MultiNet4.5Q8 | MultiNet5Q8 |
-+---------+-----------+-------------+---------------+-------------+
-| Chinese | √         | √           | √             | √           |
-+---------+-----------+-------------+---------------+-------------+
-| English |           |             |               | √           |
-+---------+-----------+-------------+---------------+-------------+
+Please refer to :doc:`Models Benchmark <../benchmark/README>` to see which models each chip currently supports.
 
 To learn how to select a different model, see :doc:`Model Loading <../flash_model/README>` .
 
@@ -1,3 +1,26 @@
+## MultiNet6
+
+#### Step 1. Data preparation
+
+For English, words are used as units. Please prepare a list of commands written in a text file `commands_en.txt` of the following format:
+
+```
+# command_id command_sentence
+1 TELL ME A JOKE
+2 MAKE A COFFEE
+```
+
+For Chinese, pinyin are used as units. Please prepare a list of commands written in a text file `commands_cn.txt` of the following format:
+```
+# command_id command_sentence
+1 da kai kong tiao
+2 guan bi kong tiao
+```
+
+#### Step 2. Move created files
+
+1. Move your `commands_en.txt` or `commands_cn.txt` to `/model/multinet_model/fst/`
+
 ## MultiNet5
 #### 1. Install g2p_en, please refer to https://pypi.org/project/g2p-en/
 
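Review note on the `commands_en.txt` / `commands_cn.txt` format added in this hunk: the sketch below shows one way to sanity-check such a file on the host before moving it into `/model/multinet_model/fst/`. The function name `parse_commands` is hypothetical and is not esp-sr's loader. Treating lines that start with `#` as comments, and accepting any run of whitespace between the id and the sentence, are assumptions drawn only from the sample files shown in the diff.

```python
def parse_commands(text):
    """Parse 'command_id command_sentence' lines into a list of (id, sentence).

    Assumptions (not confirmed by esp-sr): '#' starts a comment line, blank
    lines are ignored, and the id is separated from the sentence by whitespace.
    """
    commands = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# command_id ...' header
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2 or not (parts[0].isascii() and parts[0].isdigit()):
            raise ValueError(
                f"line {lineno}: expected '<id> <sentence>', got {raw!r}"
            )
        commands.append((int(parts[0]), parts[1].strip()))
    return commands
```

For example, feeding it the English sample from the hunk yields the pairs `(1, "TELL ME A JOKE")` and `(2, "MAKE A COFFEE")`, while a line that lacks a numeric id raises `ValueError` with the offending line number.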
@@ -42,45 +65,3 @@ multinet->reset(model_data, new_commands_str, err_id);
 // turn off the light -> commond id=2
 ```
 
-## MultiNet6
-
-
-The FST (Finite State Transducer) is used to save a list of commands.
-
-#### Step 1. Data preparation
-
-Requirements:
-- python>3.8
-- sentencepiece
-
-To create a FST from a list of commands, two files are needed:
-- commands.txt: maps a command id to subwords
-- tokens.txt: maps subword tokens to it's indices in the bpe model
-
-Assume you have a list of commands written in a text file `commands_list.txt` of the following format:
-
-```
-# command_id command_sentence
-1 TELL ME A JOKE
-2 MAKE A COFFEE
-```
-**Note**: command ids starts from 1, 0 is reserved in FST.
-
-Run the following command to create the required files, do not change the filenames `commands.txt` and `tokens.txt`.
-
-```sh
-pip install -r requirements.txt
-
-python fst/prepare_for_fst.py \
-  --infile commands_list.txt \
-  --bpe-model fst/bpe.model \
-  --out-command-list commands.txt \
-  --out-token-symbols tokens.txt
-```
-
-#### Step 2. Move created files
-
-1. Remove `/model/multinet_model/fst/fst.txt` and `/model/multinet_model/fst/fst_reversed.txt` if those files exist.
-2. Move the following files to `/model/multinet_model/fst/`
-  - commands.txt
-  - tokens.txt