doc/update README and CHANGELOG

Sun Xiang Yu 2020-03-31 18:41:38 +08:00
parent f0853199fb
commit ad755e891b
4 changed files with 43 additions and 12 deletions

@@ -1,18 +1,24 @@
# Change log for esp-sr
## 0.6.0
update multinet_cn_1.4 and add CONTINUOUS RECOGNITION mode
improve hilexin wakeNet5X3 model (v5)
support IDF v4.0 build system
replace MAP algorithm with MASE (Mic Array Speech Enhancement) algorithm v1.0
## 0.5.0
add multinet1 English model v1.0
update multinet1 Chinese model v2.0
add Mic Array Processing (MAP) algorithm
fix a bug in parsing speech commands
fix a bug in the decoder
## 0.3.0
add wakenet6
support cmake
add free wake word: hi jeson
update wakenet5X3 wake word model (v2)
## 0.2.0
add acoustic algorithms, including AEC, AGC, VAD, NS

@@ -4,7 +4,7 @@ Espressif esp_sr provides basic algorithms for **Speech Recognition** application
* The wake word detection model [WakeNet](wake_word_engine/README.md)
* The speech command recognition model [MultiNet](speech_command_recognition/README.md)
- * Acoustic algorithm: AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), AGC (Automatic Gain Control), NS (Noise Suppression)
+ * Acoustic algorithm: MASE (Mic Array Speech Enhancement), AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), AGC (Automatic Gain Control), NS (Noise Suppression)
These algorithms are provided in the form of a component, so they can be integrated into your projects with minimal effort.
@@ -19,3 +19,16 @@ Currently, Espressif has not only provided an official wake word "Hi, Lexin" to
Espressif's speech command recognition model [MultiNet](speech_command_recognition/README.md) is specially designed for flexible offline speech command recognition. With this model, you can easily add your own speech commands without retraining the model.
Currently, Espressif **MultiNet** supports up to 100 Chinese or English speech commands, such as “打开空调” (Turn on the air conditioner) and “打开卧室灯” (Turn on the bedroom light).
## Acoustic algorithm
The Espressif acoustic algorithm module is specially designed to improve speech recognition performance in far-field or noisy environments.
Currently, the MASE algorithm supports 2-mic linear arrays and 3-mic circular arrays.
**To achieve optimal performance:**
* For hardware design, please refer to [ESP32_Korvo](https://github.com/espressif/esp-skainet/tree/master/docs/zh_CN/hw-reference/esp32/user-guide-esp32-korvo-v1.1.md) or [ESP32-LyraT-Mini](https://docs.espressif.com/projects/esp-adf/en/latest/get-started/get-started-esp32-lyrat-mini.html).
* For software design, please refer to [esp-skainet](https://github.com/espressif/esp-skainet).
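
For orientation, the sketch below shows where the MASE stage sits in an application's audio path: each captured multi-channel frame is enhanced into a single channel before being handed to the wake word / command recognition stage. Only `mase_process()` comes from this component (its declaration appears in the header change later in this commit); `capture_frame()` and `feed_recognizer()` are hypothetical placeholders for the application's capture driver and recognition code, and the 16 kHz / 2-mic / 256-samples-per-channel figures are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHANNELS  2     /* assumption: 2-mic linear array      */
#define FRAME_LEN 256   /* assumption: 16 ms frame at 16 kHz   */

/* From this component's MASE header (declaration shown later in this commit);
 * the opaque handle typedef is an assumption made to keep the sketch self-contained. */
typedef void *mase_handle_t;
void mase_process(mase_handle_t st, int16_t *in, int16_t *dsp_out);

/* Hypothetical application hooks, not part of esp_sr. */
bool capture_frame(int16_t *multichannel_buf);              /* fills CHANNELS * FRAME_LEN samples */
void feed_recognizer(const int16_t *mono_frame, int len);   /* WakeNet / MultiNet stage           */

void audio_loop(mase_handle_t mase)
{
    int16_t in[CHANNELS * FRAME_LEN];   /* channel-blocked multi-channel frame */
    int16_t out[FRAME_LEN];             /* single enhanced channel             */

    while (capture_frame(in)) {
        mase_process(mase, in, out);     /* mic array speech enhancement */
        feed_recognizer(out, FRAME_LEN);
    }
}
```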

@@ -4,7 +4,7 @@ esp_sr provides algorithm models for speech recognition, currently consisting of three main modules
* The wake word detection model [WakeNet](wake_word_engine/README_cn.md)
* The speech command recognition model [MultiNet](speech_command_recognition/README_cn.md)
- * Acoustic algorithm: AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), AGC (Automatic Gain Control), NS (Noise Suppression)
+ * Acoustic algorithm: MASE (Mic Array Speech Enhancement), AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), AGC (Automatic Gain Control), NS (Noise Suppression)
These algorithms are provided as a component, so they can easily be integrated into your projects.
@@ -20,3 +20,14 @@ esp_sr provides algorithm models for speech recognition, currently consisting of three main modules
Currently, the model supports Chinese commands such as "打开空调" (Turn on the air conditioner) and "打开卧室灯" (Turn on the bedroom light), as well as English commands such as "Turn on/off the light"; up to 100 custom speech commands are supported.
## Acoustic algorithm
The acoustic algorithm module is dedicated to improving speech recognition performance in complex acoustic environments; the MASE algorithm effectively improves recognition in far-field or noisy environments.
Currently, the MASE algorithm supports 2-mic linear arrays and 3-mic circular arrays.
**Algorithm performance is closely tied to hardware design and software configuration. To achieve optimal performance:**
* For hardware design, please refer to [ESP32_Korvo](https://github.com/espressif/esp-skainet/tree/master/docs/zh_CN/hw-reference/esp32/user-guide-esp32-korvo-v1.1.md) or [ESP32-LyraT-Mini](https://docs.espressif.com/projects/esp-adf/en/latest/get-started/get-started-esp32-lyrat-mini.html).
* For software design, please refer to the relevant examples in [esp-skainet](https://github.com/espressif/esp-skainet).

@@ -67,7 +67,8 @@ mase_handle_t mase_create(int sample_rate, int frame_size, int array_type, float
*
* @return None
*
- * @note Input is a multi-channel signal while the output is single-channel. For a 16-ms multi-channel input frame, the i-th point in the c-th channel should be indexed (i + c * 256).
+ * @note Input is a multi-channel signal while the output is single-channel.
+ *       For a 16-ms multi-channel input frame, the i-th point in the c-th channel should be indexed (i + c * 256).
*
*/
void mase_process(mase_handle_t st, int16_t *in, int16_t *dsp_out);
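
A minimal per-frame usage sketch of `mase_process()`, based only on the declarations visible in this hunk. The 16 kHz sample rate, the 2-mic array, and the opaque `void *` handle typedef are assumptions; the `mase_create()` arguments are cut off in the hunk header above, so handle creation is not shown.

```c
#include <stdint.h>

/* Declarations as shown in the header above; the handle typedef here is an
 * assumption (an opaque pointer) made only to keep this sketch self-contained. */
typedef void *mase_handle_t;
void mase_process(mase_handle_t st, int16_t *in, int16_t *dsp_out);

#define MASE_FRAME_LEN 256   /* 16 ms frame, assuming a 16 kHz sample rate */
#define MASE_CHANNELS  2     /* assumption: 2-mic linear array             */

/* Pack two per-mic buffers into the channel-blocked layout described in the
 * @note (sample i of channel c lives at index i + c * 256), then run MASE. */
void mase_frame_example(mase_handle_t st,
                        const int16_t *mic0,  /* 256 samples from mic 0 */
                        const int16_t *mic1,  /* 256 samples from mic 1 */
                        int16_t *dsp_out)     /* 256-sample mono output */
{
    int16_t in[MASE_CHANNELS * MASE_FRAME_LEN];

    for (int i = 0; i < MASE_FRAME_LEN; i++) {
        in[i + 0 * MASE_FRAME_LEN] = mic0[i];
        in[i + 1 * MASE_FRAME_LEN] = mic1[i];
    }

    /* Multi-channel in, single enhanced channel out. */
    mase_process(st, in, dsp_out);
}
```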