mirror of https://github.com/espressif/esp-sr.git
synced 2025-09-15 15:28:44 +08:00
docs: update mn6 docs
parent ccef4e046e
commit e0cae984cb
@@ -154,21 +154,44 @@ Resource Occupancy
| MultiNet 6  | 48 KB       | 4000 KB     | 12 ms       | 32 ms       |
+-------------+-------------+-------------+-------------+-------------+

Performance Test
~~~~~~~~~~~~~~~~
Word Error Rate Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+-----------+-----------+----------+------------+-----------+
| Model     | Distance  | Quiet    | Stationary | Speech    |
| Type      |           |          | Noise      | Noise     |
|           |           |          | (SNR = 4   | (SNR = 4  |
|           |           |          | dB)        | dB)       |
+===========+===========+==========+============+===========+
| MultiNet  | 3 m       | 98%      | 93%        | 92%       |
| 4         |           |          |            |           |
+-----------+-----------+----------+------------+-----------+
| MultiNet  | 3 m       | 94%      | 92%        | 91%       |
| 4 Q8      |           |          |            |           |
+-----------+-----------+----------+------------+-----------+

+-----------+-----------+
| Model     | aishell   |
| Type      | test      |
+===========+===========+
| MultiNet  | 9.5%      |
| 5_cn      |           |
+-----------+-----------+
| MultiNet  | 5.2%      |
| 6_cn      |           |
+-----------+-----------+

+-------------+-------------+-------------+
| Model Type  | librispeech | librispeech |
|             | test-clean  | test-other  |
+=============+=============+=============+
| MultiNet5-en| 16.5%       | 41.4%       |
+-------------+-------------+-------------+
| MultiNet6-en| 9.0%        | 21.3%       |
+-------------+-------------+-------------+

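For readers unfamiliar with the metric in the tables above, word error rate is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition. The function name and sample sentences are illustrative only, and this is not the scoring script used to produce the figures in this commit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("light" -> "lights") out of four reference words.
    print(word_error_rate("turn on the light", "turn on the lights"))
```

Lower values are better, and because insertions are counted the value can exceed 100% for very poor output.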
Speech Commands Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+-----------+-----------+----------+------------+-------------+
| Model     | Distance  | Quiet    | Stationary | Speech      |
| Type      |           |          | Noise      | Noise       |
|           |           |          | (SNR =     | (SNR =      |
|           |           |          | 5-10 dB)   | 5-10 dB)    |
+===========+===========+==========+============+=============+
| MultiNet  | 3 m       |          |            |             |
| 5_cn      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+
| MultiNet  | 3 m       |          |            |             |
| 6_cn      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+


TTS

@@ -42,31 +42,18 @@ Please see the flow diagram for commands recognition below:

.. _command-requirements:

Format of Speech Commands
-------------------------------

Different MultiNet versions support different speech command formats:

- MultiNet5 uses phonemes for English speech commands. For simplicity, characters are used to denote the different phonemes. Please use :project_file:`tool/multinet_g2p.py` to do the conversion.
- MultiNet6 uses graphemes for English speech commands. No conversion is needed.

Suggestions on Customizing Speech Commands
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When customizing speech commands, please pay attention to the following suggestions:

.. list::

    :esp32s3: - The recommended length of English speech commands is generally 4-6 words
    - Mixed Chinese and English is not supported in command words
    - The command word cannot contain Arabic numerals or special characters

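The suggestions above can be checked mechanically before a command list is flashed. The sketch below is a hypothetical helper, not part of esp-sr: the name ``check_english_command`` is made up for illustration, and it assumes that "special characters" means anything other than ASCII letters and spaces. The 4-6 word range is treated as a recommendation, as in the list above.

```python
from typing import List


def _is_cjk(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def check_english_command(phrase: str) -> List[str]:
    """Return a list of problems found; an empty list means the phrase passes."""
    problems = []
    if any(ch in "0123456789" for ch in phrase):
        problems.append("contains Arabic numerals")
    if any(_is_cjk(ch) for ch in phrase):
        problems.append("mixes Chinese and English")
    # Assumption: only ASCII letters and spaces are acceptable; digits and
    # Chinese characters are already reported by the two checks above.
    if any(not (ch == " " or _is_cjk(ch) or (ch.isascii() and ch.isalnum()))
           for ch in phrase):
        problems.append("contains special characters")
    word_count = len(phrase.split())
    if not 4 <= word_count <= 6:
        problems.append(f"has {word_count} words; 4-6 are recommended")
    return problems


if __name__ == "__main__":
    for candidate in ("turn on the light", "lights on!", "turn on light 2"):
        print(repr(candidate), check_english_command(candidate))
```

A phrase that yields an empty list satisfies the written suggestions only; it says nothing about how well the model will actually recognize it.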
Speech Commands Customization Methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------------------------

.. note::

    Mixed Chinese and English is not supported in command words.

    The command word cannot contain Arabic numerals and special characters.


MultiNet6 customize speech commands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Words are used as units. Edit the text file :project_file:`model/multinet_model/fst/commands_en.txt` using the following format:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MultiNet6 uses graphemes for English speech commands, so you can add or modify speech commands using words directly. Edit the text file :project_file:`model/multinet_model/fst/commands_en.txt` using the following format:

::

@@ -76,8 +63,9 @@ MultiNet6 customize speech commands


MultiNet5 customize speech commands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

MultiNet5 uses phonemes for English speech commands. For simplicity, characters are used to denote the different phonemes. Please use :project_file:`tool/multinet_g2p.py` to do the conversion.
There are two methods to customize speech commands offline:

- Via ``menuconfig``
@@ -110,10 +98,6 @@ There are two methods to customize speech commands offline:
*/
esp_err_t esp_mn_commands_update_from_sdkconfig(esp_mn_iface_t *multinet, const model_iface_data_t *model_data);

- Via modifying code

  Users customize the speech commands directly in the code and pass them to MultiNet. In real-world scenarios, the commands can be passed in through various interfaces, including network, UART, and SPI. For a detailed description of the APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` and the examples in ESP-Skainet.

Use MultiNet
------------


@@ -154,22 +154,44 @@ MultiNet
| MultiNet 6  | 48 KB       | 4000 KB     | 12 ms       | 32 ms       |
+-------------+-------------+-------------+-------------+-------------+

Performance Test
~~~~~~~~~~~~~~~~
Word Error Rate Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+-----------+-----------+----------+------------+-----------+
| Model     | Distance  | Quiet    | Stationary | Speech    |
| Type      |           |          | Noise      | Noise     |
|           |           |          | (SNR = 4   | (SNR = 4  |
|           |           |          | dB)        | dB)       |
+===========+===========+==========+============+===========+
| MultiNet  | 3 m       | 98%      | 93%        | 92%       |
| 4         |           |          |            |           |
+-----------+-----------+----------+------------+-----------+
| MultiNet  | 3 m       | 94%      | 92%        | 91%       |
| 4 Q8      |           |          |            |           |
+-----------+-----------+----------+------------+-----------+

+-----------+-----------+
| Model     | aishell   |
| Type      | test      |
+===========+===========+
| MultiNet  | 9.5%      |
| 5_cn      |           |
+-----------+-----------+
| MultiNet  | 5.2%      |
| 6_cn      |           |
+-----------+-----------+

+-------------+-------------+-------------+
| Model Type  | librispeech | librispeech |
|             | test-clean  | test-other  |
+=============+=============+=============+
| MultiNet5-en| 16.5%       | 41.4%       |
+-------------+-------------+-------------+
| MultiNet6-en| 9.0%        | 21.3%       |
+-------------+-------------+-------------+

Speech Commands Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+-----------+-----------+----------+------------+-------------+
| Model     | Distance  | Quiet    | Stationary | Speech      |
| Type      |           |          | Noise      | Noise       |
|           |           |          | (SNR =     | (SNR =      |
|           |           |          | 5-10 dB)   | 5-10 dB)    |
+===========+===========+==========+============+=============+
| MultiNet  | 3 m       |          |            |             |
| 5_cn      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+
| MultiNet  | 3 m       |          |            |             |
| 6_cn      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+

TTS
---

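@@ -47,36 +47,21 @@ The input to MultiNet is audio processed by the audio front-end (AFE) algorithm (format

Different versions of MultiNet use different speech command formats. Speech commands must follow a specific format, as described below:

MultiNet5 and MultiNet6 use Hanyu Pinyin as the basic recognition unit, with the pinyin of each character separated by a space. For example, “打开空调” (turn on the air conditioner) should be written as “da kai kong tiao”. Please use the following tool to convert Chinese characters into pinyin: :project_file:`tool/multinet_pinyin.py`.


Customization Requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~
Speech Commands Customization Methods
-------------------------------------

When designing speech commands, note the following requirements and suggestions:
.. note::
    Mixed Chinese and English is not supported.
    Speech commands cannot contain Arabic numerals or special characters.
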
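.. list::
MultiNet supports flexible ways of setting speech commands: they can be set online or offline, and can be added, deleted, or modified dynamically at any time.

    - The recommended length of Chinese speech commands is generally 4-6 characters. Commands that are too short cause a high false recognition rate, while commands that are too long are hard for users to remember
    :esp32s3: - The recommended length of English speech commands is generally 4-6 words
    - Mixed Chinese and English is not supported
    - Speech commands cannot contain Arabic numerals or special characters
    - Avoid commonly used everyday phrases
    - The more the pronunciations of the characters/words in a command differ, the better
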
Customization Methods
~~~~~~~~~~~~~~~~~~~~~

MultiNet supports flexible ways of setting speech commands: they can be set online or offline, and can be added, deleted, or modified dynamically at any time.

.. only:: latex

    .. figure:: ../../_static/QR_multinet_g2p.png
        :alt: menuconfig_add_speech_commands

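Setting Speech Commands Offline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Method for setting MultiNet6 speech commands offline:
MultiNet5 and MultiNet6 use Hanyu Pinyin as the basic recognition unit. For example, “打开空调” (turn on the air conditioner) should be written as “da kai kong tiao”. Please use the following tool to convert Chinese characters into pinyin: :project_file:`tool/multinet_pinyin.py`.

MultiNet6 definition method:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- For Chinese, modify :project_file:`model/multinet_model/fst/commands_cn.txt`
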
@@ -88,7 +73,8 @@ Method for setting MultiNet6 speech commands offline:
1 da kai kong tiao
2 guan bi kong tiao

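Judging only from the two sample lines above, each entry appears to be a numeric command ID followed by space-separated pinyin syllables. The sketch below parses text in that assumed layout and rejects entries that are not plain lowercase pinyin. ``parse_commands`` is a hypothetical helper, not an esp-sr API, and the real ``commands_cn.txt`` may allow syntax that is not visible in this diff.

```python
from typing import List, Tuple


def parse_commands(text: str) -> List[Tuple[int, str]]:
    """Parse lines such as '1 da kai kong tiao' into (id, phrase) pairs."""
    commands: List[Tuple[int, str]] = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        command_id, _, phrase = line.partition(" ")
        syllables = phrase.split()
        if not (command_id.isascii() and command_id.isdigit()) or not syllables:
            raise ValueError(f"line {line_no}: expected '<id> <pinyin ...>'")
        # Assumption: syllables are lowercase ASCII letters without tone marks,
        # as in the sample; Chinese characters must be converted first.
        if not all(s.isascii() and s.isalpha() and s.islower() for s in syllables):
            raise ValueError(f"line {line_no}: syllables must be lowercase letters")
        commands.append((int(command_id), " ".join(syllables)))
    return commands


SAMPLE = "1 da kai kong tiao\n2 guan bi kong tiao\n"

if __name__ == "__main__":
    for command_id, phrase in parse_commands(SAMPLE):
        print(command_id, phrase)
```

Returning a list of pairs rather than a dictionary keeps every line, so the sketch makes no assumption about whether one ID may appear on more than one line.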
Method for setting MultiNet5 speech commands offline:
MultiNet5 definition method:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Via ``menuconfig``

@@ -120,9 +106,6 @@ Method for setting MultiNet5 speech commands offline:
*/
esp_err_t esp_mn_commands_update_from_sdkconfig(esp_mn_iface_t *multinet, const model_iface_data_t *model_data);

- Via modifying code

  With this method, users write the speech commands directly in the code and pass them to MultiNet. During actual product development and use, the required speech commands can be passed in through various interfaces, such as network, UART, and SPI, and can be replaced at any time. For a detailed description of the APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` and the examples in ESP-Skainet.

Using MultiNet
--------------
