Merge branch 'feat/set_threshold' into 'master'

Feat/set threshold

See merge request speech-recognition-framework/esp-sr!155
2025-09-15 15:28:44 +08:00 · 2025-04-10 17:13:44 +08:00 · 2d151b7193
commit 2d151b7193
parent 06ad27cab0 5b5b69630b
19 changed files with 16 additions and 10 deletions
--- a/README.md
+++ b/README.md
@@ -23,7 +23,6 @@ The new algorithms will no longer support ESP32 chips.

 News
 ----
-
 [14/2/2025]: We release **ESP-SR V2.0**. [Migration from ESP-SR V1.* to ESP-SR V2.*](https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/audio_front_end/migration_guide.html)   
 [13/2/2025]: We release **VADNet**, a voice activaty detection model. You can use it to replace the WebRTC VAD and improve the performance.

@@ -89,11 +88,15 @@ The following MultiNet models are supported in esp-sr:

 ## Audio Front End

-Espressif Audio Front-End **AFE** integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation) and NS (Noise Suppression).
+Espressif Audio Front-End **AFE** integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation) and NS (Noise Suppression), NSNET(Deep noise suppression) and other functions. It is designed to be used with the ESP-SR library.

 Our two-mic Audio Front-End (AFE) have been qualified as a “Software Audio Front-End Solution” for [Amazon Alexa Built-in devices](https://developer.amazon.com/en-US/alexa/solution-providers/alexa-connect-kit).


-**In order to achieve optimal performance:**
+## Documentation and Resources
+
+ESP-SR Documentation: [ESP-SR Documentation](https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/index.html)
+Migration Guide: [Migration from V1.* to V2.*](https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/audio_front_end/migration_guide.html)
+Wake Word Training: [Wake Word Training by TTS Pipeline V2.0](https://github.com/espressif/esp-sr/issues/88)
+Examples: [esp-skainet/examples](https://github.com/espressif/esp-skainet)

-* Please refer to software design [esp-skainet](https://github.com/espressif/esp-skainet).
--- a/lib/esp32p4/libesp_audio_front_end.a
+++ b/lib/esp32p4/libesp_audio_front_end.a
--- a/lib/esp32p4/libesp_audio_processor.a
+++ b/lib/esp32p4/libesp_audio_processor.a
--- a/lib/esp32p4/libmultinet.a
+++ b/lib/esp32p4/libmultinet.a
--- a/lib/esp32p4/libvadnet.a
+++ b/lib/esp32p4/libvadnet.a
--- a/lib/esp32p4/libwakenet.a
+++ b/lib/esp32p4/libwakenet.a
--- a/lib/esp32s3/libc_speech_features.a
+++ b/lib/esp32s3/libc_speech_features.a
--- a/lib/esp32s3/libdl_lib.a
+++ b/lib/esp32s3/libdl_lib.a
--- a/lib/esp32s3/libesp_audio_front_end.a
+++ b/lib/esp32s3/libesp_audio_front_end.a
--- a/lib/esp32s3/libesp_audio_processor.a
+++ b/lib/esp32s3/libesp_audio_processor.a
--- a/lib/esp32s3/libflite_g2p.a
+++ b/lib/esp32s3/libflite_g2p.a
--- a/lib/esp32s3/libfst.a
+++ b/lib/esp32s3/libfst.a
--- a/lib/esp32s3/libhufzip.a
+++ b/lib/esp32s3/libhufzip.a
--- a/lib/esp32s3/libmultinet.a
+++ b/lib/esp32s3/libmultinet.a
--- a/lib/esp32s3/libnsnet.a
+++ b/lib/esp32s3/libnsnet.a
--- a/lib/esp32s3/libvadnet.a
+++ b/lib/esp32s3/libvadnet.a
--- a/lib/esp32s3/libwakenet.a
+++ b/lib/esp32s3/libwakenet.a
--- a/test_apps/esp-sr/main/test_multinet.cpp
+++ b/test_apps/esp-sr/main/test_multinet.cpp
@@ -235,6 +235,7 @@ TEST_CASE("multinet set commands from sdkconfig and detect", "[mn]")
    esp_mn_error_t *error_phrases = NULL;
    esp_mn_commands_update_from_sdkconfig(multinet, model_data);
    multinet->print_active_speech_commands(model_data);
+    multinet->set_det_threshold(model_data, 0.1);

    while (1) {
        if ((chunks + 1)*audio_chunksize <= data_size) {
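Note on the hunk above: the only functional change is the call `multinet->set_det_threshold(model_data, 0.1)`, made through the interface pointer after the commands are loaded and before the detection loop starts. The sketch below illustrates that call pattern in isolation. `MockModel`, `MockMnIface`, `mock_multinet`, the getter, the return codes and the (0, 1) range check are all hypothetical stand-ins for illustration; this diff does not show the esp-sr definitions or the valid threshold range.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the opaque model state; NOT the esp-sr type.
struct MockModel {
    float det_threshold = 0.0f;
};

// Hypothetical function-pointer table, mirroring the
// `multinet->fn(model_data, ...)` call style seen in the test.
struct MockMnIface {
    int (*set_det_threshold)(MockModel *model, float threshold);
    float (*get_det_threshold)(const MockModel *model);
};

// Assumption: values outside (0, 1) are rejected and the old value is kept.
// The real library may validate differently or not at all.
static int mock_set_det_threshold(MockModel *model, float threshold) {
    if (model == nullptr || threshold <= 0.0f || threshold >= 1.0f) {
        return -1;
    }
    model->det_threshold = threshold;
    return 0;
}

static float mock_get_det_threshold(const MockModel *model) {
    return model->det_threshold;
}

static const MockMnIface mock_multinet = {mock_set_det_threshold,
                                          mock_get_det_threshold};

// Same order as the test: configure the threshold once, before any
// audio chunk would be fed to the detector.
int run_threshold_demo() {
    MockModel model_data;
    int rc = mock_multinet.set_det_threshold(&model_data, 0.1f);
    assert(rc == 0);
    std::printf("threshold: %.2f\n",
                mock_multinet.get_det_threshold(&model_data));
    return rc;
}
```

In the real test the threshold is set once per model instance, so a later call inside the `while (1)` loop would not be needed unless the threshold is meant to change at run time.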
--- a/test_apps/esp-sr/sdkconfig.ci.p4_afe
+++ b/test_apps/esp-sr/sdkconfig.ci.p4_afe
@@ -5,19 +5,21 @@ CONFIG_IDF_TARGET="esp32p4"
 CONFIG_ESPTOOLPY_FLASHMODE_QIO=y
 CONFIG_ESPTOOLPY_FLASHSIZE_16MB=y
 CONFIG_PARTITION_TABLE_CUSTOM=y
+CONFIG_SR_NSN_NSNET2=y
 CONFIG_SR_VADN_VADNET1_MEDIUM=y
 CONFIG_SR_WN_WN9_HILEXIN=y
-CONFIG_SR_NSN_NSNET2=y
-CONFIG_SPIRAM=y
-CONFIG_ESP_TASK_WDT_EN=n
-CONFIG_ESP_TASK_WDT_INIT=n
-CONFIG_ESP_MAIN_TASK_STACK_SIZE=10240
 CONFIG_COMPILER_OPTIMIZATION_PERF=y
+CONFIG_COMPILER_ORPHAN_SECTIONS_PLACE=y
 CONFIG_ESP32P4_REV_MIN_0=y
 CONFIG_SPIRAM=y
 CONFIG_SPIRAM_SPEED_200M=y
 CONFIG_CACHE_L2_CACHE_256KB=y
 CONFIG_CACHE_L2_CACHE_LINE_128B=y
 CONFIG_ESP_SYSTEM_ALLOW_RTC_FAST_MEM_AS_HEAP=n
+CONFIG_ESP_MAIN_TASK_STACK_SIZE=10240
+CONFIG_ESP_TASK_WDT_EN=n
+CONFIG_LOG_COLORS=y
+CONFIG_LWIP_HOOK_IP6_INPUT_NONE=y
 CONFIG_MBEDTLS_CMAC_C=y
-CONFIG_IDF_EXPERIMENTAL_FEATURES=y
+CONFIG_MBEDTLS_ECP_FIXED_POINT_OPTIM=y
+CONFIG_IDF_EXPERIMENTAL_FEATURES=y