
wakeNet

wakeNet is a wake word engine built upon a neural network and designed for low-power embedded MCUs. The wakeNet model currently supports up to 5 wake words.

Overview

Please see the flow diagram of wakeNet below:

  • Speech Feature:
    wakeNet uses MFCC to obtain the features of the input audio clip (16 kHz, 16-bit, mono). The window width and step width of each frame of the audio clip are both 30 ms.

  • Neural Network:
    The neural network structure has now been updated to its ninth edition, among which:

    • wakeNet1, wakeNet2, wakeNet3, wakeNet4, wakeNet6 and wakeNet7 are no longer in use.
    • wakeNet5 supports only the ESP32 chip.
    • wakeNet8 and wakeNet9 support only the ESP32S3 chip and are built upon the Dilated Convolution structure.

    Note that wakeNet5, wakeNet5X2 and wakeNet5X3 share the same network structure, but wakeNet5X2 and wakeNet5X3 have more parameters than wakeNet5. Please refer to Resource Occupancy for details.
  • Keyword Triggering Method:
    For a continuous audio stream, we calculate the average recognition result (M) over several frames to produce a smoothed prediction, which improves the accuracy of keyword triggering. A trigger command is sent only when M is larger than the set threshold.
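
The framing and triggering steps above can be sketched in plain C. This is only an illustration under stated assumptions, not the wakeNet implementation: the window length `SMOOTH_FRAMES`, the value of `TRIGGER_THRESHOLD`, and all type and function names are hypothetical, and the per-frame scores are assumed to be values between 0 and 1 produced by the network. The sketch also waits until the window is full before it allows a trigger, which is a design choice made here so that a single high-scoring frame cannot fire on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values from the text: 16 kHz audio, 30 ms window and 30 ms step. */
#define SAMPLE_RATE_HZ 16000
#define FRAME_MS       30
#define FRAME_SAMPLES  (SAMPLE_RATE_HZ * FRAME_MS / 1000) /* 480 samples */

/* Hypothetical tuning values, chosen only for this illustration. */
#define SMOOTH_FRAMES     4
#define TRIGGER_THRESHOLD 0.8f

/* Ring buffer holding the most recent per-frame scores. */
typedef struct {
    float  scores[SMOOTH_FRAMES];
    size_t next;   /* slot that the next score overwrites */
    size_t count;  /* number of valid scores, at most SMOOTH_FRAMES */
} kw_smoother_t;

/* Window and step are equal, so frames do not overlap. */
static size_t frames_in_samples(size_t n_samples)
{
    return n_samples / FRAME_SAMPLES;
}

static void kw_smoother_init(kw_smoother_t *s)
{
    assert(s != NULL);
    s->next = 0;
    s->count = 0;
}

/* Store one score and return the mean M of the buffered scores. */
static float kw_smoother_push(kw_smoother_t *s, float score)
{
    assert(s != NULL);
    s->scores[s->next] = score;
    s->next = (s->next + 1) % SMOOTH_FRAMES;
    if (s->count < SMOOTH_FRAMES) {
        s->count++;
    }
    float sum = 0.0f;
    for (size_t i = 0; i < s->count; i++) {
        sum += s->scores[i];
    }
    return sum / (float)s->count;
}

/* Return the index of the first frame at which the window is full and
 * M exceeds the threshold, or -1 if no frame triggers. */
static int first_trigger_frame(const float *scores, size_t n)
{
    kw_smoother_t s;
    kw_smoother_init(&s);
    for (size_t i = 0; i < n; i++) {
        float m = kw_smoother_push(&s, scores[i]);
        if (s.count == SMOOTH_FRAMES && m > TRIGGER_THRESHOLD) {
            return (int)i;
        }
    }
    return -1;
}
```

With these illustrative settings, an isolated high score surrounded by low ones never triggers, while a run of consecutive high scores does once the average over the last four frames rises above the threshold.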

The following table shows the models supported by Espressif SoCs:

| SoCs    | wakeNet5 | wakeNet8 | wakeNet9 |
|---------|----------|----------|----------|
| ESP32   | Yes      | No       | No       |
| ESP32S3 | No       | Yes      | Yes      |

Use wakeNet

  • How to select the wakeNet model

    Please refer to the Flash model introduction.

  • How to run wakeNet

    wakeNet is currently included in the AFE, where it runs by default and returns the detection results through the AFE fetch interface.

    If users want to disable wakeNet, please use:

    afe_config.wakenet_init = false;
    

Performance Test

Please refer to Performance_test.

Wake Word Customization

For details on how to customize your wake words, please see Espressif Speech Wake Word Customization Process.