Merge branch 'feat/set_threshold' into 'master'

Feat/set threshold

See merge request speech-recognition-framework/esp-sr!155
Sun Xiang Yu 2025-04-10 17:13:44 +08:00
commit 2d151b7193
19 changed files with 16 additions and 10 deletions

@@ -23,7 +23,6 @@ The new algorithms will no longer support ESP32 chips.
News
----
[14/2/2025]: We release **ESP-SR V2.0**. [Migration from ESP-SR V1.* to ESP-SR V2.*](https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/audio_front_end/migration_guide.html)
[13/2/2025]: We release **VADNet**, a voice activity detection model. You can use it to replace the WebRTC VAD and improve the performance.
@@ -89,11 +88,15 @@ The following MultiNet models are supported in esp-sr:
## Audio Front End
Espressif Audio Front-End **AFE** integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation) and NS (Noise Suppression).
Espressif Audio Front-End **AFE** integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation), NS (Noise Suppression), NSNET (Deep Noise Suppression) and other functions. It is designed to be used with the ESP-SR library.
Our two-mic Audio Front-End (AFE) has been qualified as a “Software Audio Front-End Solution” for [Amazon Alexa Built-in devices](https://developer.amazon.com/en-US/alexa/solution-providers/alexa-connect-kit).
**In order to achieve optimal performance:**
## Documentation and Resources
ESP-SR Documentation: [ESP-SR Documentation](https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/index.html)
Migration Guide: [Migration from V1.* to V2.*](https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/audio_front_end/migration_guide.html)
Wake Word Training: [Wake Word Training by TTS Pipeline V2.0](https://github.com/espressif/esp-sr/issues/88)
Examples: [esp-skainet/examples](https://github.com/espressif/esp-skainet)
* Please refer to software design [esp-skainet](https://github.com/espressif/esp-skainet).

Binary files not shown (16 files).

@@ -235,6 +235,7 @@ TEST_CASE("multinet set commands from sdkconfig and detect", "[mn]")
esp_mn_error_t *error_phrases = NULL;
esp_mn_commands_update_from_sdkconfig(multinet, model_data);
multinet->print_active_speech_commands(model_data);
multinet->set_det_threshold(model_data, 0.1);
while (1) {
if ((chunks + 1)*audio_chunksize <= data_size) {
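The hunk above adds a call to `multinet->set_det_threshold(model_data, 0.1)` before the detection loop. As a rough illustration of what a detection threshold typically controls, here is a minimal host-side sketch. It does not use the esp-sr API: `mock_model_t`, `mock_set_det_threshold`, and `mock_is_detected` are hypothetical names, and the accepted range (strictly between 0 and 1) and the "score >= threshold" rule are assumptions made for illustration, not behavior confirmed by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a model handle; NOT the esp-sr data type. */
typedef struct {
    float det_threshold; /* minimum score that counts as a detection */
} mock_model_t;

/*
 * Store a new detection threshold.
 * Assumption: only values strictly between 0 and 1 are accepted.
 * Returns 0 on success, -1 if the handle is NULL or the value is out of range.
 */
static int mock_set_det_threshold(mock_model_t *model, float threshold)
{
    if (model == NULL || !(threshold > 0.0f && threshold < 1.0f)) {
        return -1; /* leave the previous threshold untouched */
    }
    model->det_threshold = threshold;
    return 0;
}

/*
 * Decide whether a score counts as a detection.
 * Assumption: a command is reported when its score reaches the threshold.
 */
static bool mock_is_detected(const mock_model_t *model, float score)
{
    assert(model != NULL);
    return score >= model->det_threshold;
}
```

Under these assumptions, lowering the threshold to 0.1 (as the test does) makes the mock report detections for low-scoring phrases, which is convenient in a unit test but would likely raise the false-accept rate in a product.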

@@ -5,19 +5,21 @@ CONFIG_IDF_TARGET="esp32p4"
CONFIG_ESPTOOLPY_FLASHMODE_QIO=y
CONFIG_ESPTOOLPY_FLASHSIZE_16MB=y
CONFIG_PARTITION_TABLE_CUSTOM=y
CONFIG_SR_NSN_NSNET2=y
CONFIG_SR_VADN_VADNET1_MEDIUM=y
CONFIG_SR_WN_WN9_HILEXIN=y
CONFIG_SR_NSN_NSNET2=y
CONFIG_SPIRAM=y
CONFIG_ESP_TASK_WDT_EN=n
CONFIG_ESP_TASK_WDT_INIT=n
CONFIG_ESP_MAIN_TASK_STACK_SIZE=10240
CONFIG_COMPILER_OPTIMIZATION_PERF=y
CONFIG_COMPILER_ORPHAN_SECTIONS_PLACE=y
CONFIG_ESP32P4_REV_MIN_0=y
CONFIG_SPIRAM=y
CONFIG_SPIRAM_SPEED_200M=y
CONFIG_CACHE_L2_CACHE_256KB=y
CONFIG_CACHE_L2_CACHE_LINE_128B=y
CONFIG_ESP_SYSTEM_ALLOW_RTC_FAST_MEM_AS_HEAP=n
CONFIG_ESP_MAIN_TASK_STACK_SIZE=10240
CONFIG_ESP_TASK_WDT_EN=n
CONFIG_LOG_COLORS=y
CONFIG_LWIP_HOOK_IP6_INPUT_NONE=y
CONFIG_MBEDTLS_CMAC_C=y
CONFIG_IDF_EXPERIMENTAL_FEATURES=y
CONFIG_MBEDTLS_ECP_FIXED_POINT_OPTIM=y
CONFIG_IDF_EXPERIMENTAL_FEATURES=y