Mirror of https://github.com/espressif/esp-sr.git (synced 2025-09-15 15:28:44 +08:00)

docs(AFE): Update AFE docs

This commit is contained in:
parent a2bc8a64d9
commit 04ae02fea2
docs/audio_front_end/README.md (Normal file → Executable file)
@ -20,10 +20,10 @@ The workflow of Espressif AFE can be divided into four parts:

- AFE creation and initialization
- AFE feed: input audio data; AEC runs inside the feed function
- Internal BSS/NS algorithms
- AFE fetch: returns the processed audio data and an output value. AFE fetch performs VAD internally; if WakeNet is enabled, it also performs wake-word detection

**Note:** `afe->feed()` and `afe->fetch()` are visible to users, while the `internal BSS/NS task` is invisible to users.

> AEC runs in the `afe->feed()` function;
> BSS/NS is an independent task in AFE;
@ -41,45 +41,12 @@ Espressif AFE supports both single MIC and dual MIC scenarios. The internal task

    esp_afe_sr_iface_t *afe_handle = &esp_afe_sr_2mic;

### Select AFE mode

- Single MIC

  Espressif AFE single MIC supports 2 working modes: SR_MODE_MONO_LOW_COST and SR_MODE_MONO_MEDIUM_COST.

  - SR_MODE_MONO_LOW_COST

    Suitable for mono audio data plus one reference channel, with very low memory and CPU consumption. It runs a low-complexity AEC and a low-complexity mono NS algorithm.

  - SR_MODE_MONO_MEDIUM_COST

    Suitable for mono audio data plus one reference channel, with low memory and CPU consumption. It runs a low-complexity AEC and a medium-complexity mono NS algorithm.

- Dual MIC

  Espressif AFE dual MIC supports 3 working modes: SR_MODE_STEREO_LOW_COST, SR_MODE_STEREO_MEDIUM, and SR_MODE_STEREO_HIGH_PERF.

  - SR_MODE_STEREO_LOW_COST

    Suitable for two-channel audio data plus one reference channel. It runs a low-complexity AEC and a low-complexity BSS.

  - SR_MODE_STEREO_MEDIUM

    Suitable for two-channel audio data plus one reference channel. It runs a high-complexity AEC and a low-complexity BSS.

  - SR_MODE_STEREO_HIGH_PERF

    Suitable for two-channel audio data plus one reference channel. It runs a high-complexity AEC and a high-complexity BSS.
### Input Audio data

- AFE single MIC

  - Input audio data format: 16 KHz, 16-bit, two channels (one is mic data, the other is reference data)
  - The data frame length is 32 ms. Users can use `afe->get_feed_chunksize()` to get the number of sampling points needed (the data type of sampling points is int16).

    **Note**: the number of sampling points returned by `afe->get_feed_chunksize()` covers one channel only.

  The input data is arranged as follows:

@ -88,16 +55,17 @@ Espressif AFE supports both single MIC and dual MIC scenarios. The internal task

- AFE dual MIC

  - Input audio data format: 16 KHz, 16-bit, three channels (two are mic data, the other is reference data)
  - The data frame length is 32 ms. Users can use `afe->get_feed_chunksize()` to get the number of sampling points needed (the data type of sampling points is int16).

  The input data is arranged as follows:

  <img src="../img/AFE_mode_other.png" height = "70" align=center />

  Note: the total data size in bytes is `afe->get_feed_chunksize() * channel number * sizeof(short)`.
### AEC Introduction

The AEC (Acoustic Echo Cancellation) algorithm supports a maximum of two-mic processing. It can effectively remove the echo in the mic input signal, which helps with further speech recognition.

### NS (noise suppression)
@ -125,7 +93,7 @@ The output audio of AFE is single-channel data. When WakeNet is enabled, AFE wil

### 1. Define afe_handle

`afe_handle` is the function handle through which the user calls the AFE interfaces. Users need to select the corresponding AFE handle according to single MIC or dual MIC applications.

- Single MIC

@ -135,102 +103,162 @@ The output audio of AFE is single-channel data. When WakeNet is enabled, AFE wil

    esp_afe_sr_iface_t *afe_handle = &esp_afe_sr_2mic;
### 2. Configure AFE

Get the default configuration of AFE:

    afe_config_t afe_config = AFE_CONFIG_DEFAULT();

Users can adjust the switch of each algorithm module and its corresponding parameters in the macro `AFE_CONFIG_DEFAULT()`:

```
#define AFE_CONFIG_DEFAULT() { \
    .aec_init = true, \
    .se_init = true, \
    .vad_init = true, \
    .wakenet_init = true, \
    .vad_mode = 3, \
    .wakenet_model = &WAKENET_MODEL, \
    .wakenet_coeff = &WAKENET_COEFF, \
    .wakenet_mode = DET_MODE_2CH_90, \
    .afe_mode = SR_MODE_HIGH_PERF, \
    .afe_perferred_core = 0, \
    .afe_perferred_priority = 5, \
    .afe_ringbuf_size = 50, \
    .alloc_from_psram = 1, \
    .agc_mode = 2, \
}
```
- aec_init: whether the AEC algorithm is enabled.

- se_init: whether the BSS/NS algorithm is enabled.

- vad_init: whether the VAD algorithm is enabled.

- wakenet_init: whether the wake-word algorithm is enabled.

- vad_mode: the VAD operating mode. A higher mode is more aggressive.

- wakenet_model/wakenet_coeff/wakenet_mode: use `make menuconfig` to choose the WakeNet model. Please refer to: [WakeNet](https://github.com/espressif/esp-sr/tree/b9504e35485b60524977a8df9ff448ca89cd9d62/wake_word_engine)

- afe_mode: Espressif AFE supports two working modes: SR_MODE_LOW_COST and SR_MODE_HIGH_PERF. See the afe_sr_mode_t enumeration for details.

  - SR_MODE_LOW_COST: the quantized version, which occupies fewer resources.

  - SR_MODE_HIGH_PERF: the non-quantized version, which occupies more resources.

  **Note: ESP32 only supports SR_MODE_HIGH_PERF; ESP32-S3 supports both modes.**

- afe_perferred_core: the CPU core on which the internal BSS/NS algorithm of AFE runs.

- afe_ringbuf_size: the size of the internal ring buffer.

- alloc_from_psram: whether to allocate memory from external PSRAM first. Three values can be configured:

  - 0: allocate from internal RAM.

  - 1: allocate part of the memory from external PSRAM.

  - 2: allocate most of the memory from external PSRAM.

- agc_mode: the linear audio amplification level ([0, 3]; 0 means no amplification).
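A typical adjustment of the defaults, sketched as a configuration fragment (field names and values are the ones listed above; which fields you change depends on your board):

```c
afe_config_t afe_config = AFE_CONFIG_DEFAULT();
afe_config.aec_init = false;             /* e.g. no reference (loopback) channel wired up */
afe_config.afe_mode = SR_MODE_LOW_COST;  /* quantized mode for memory-tight targets */
afe_config.alloc_from_psram = 2;         /* put most AFE buffers in external PSRAM */
```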
### 3. Create afe_data

The user uses the `afe_handle->create_from_config(&afe_config)` function to obtain the data handle, which will be used internally in AFE. The parameter passed in is the configuration obtained in step 2 above.

```
/**
 * @brief Function to initialize an AFE_SR instance
 *
 * @param afe_config The config of AFE_SR
 * @returns Handle to the AFE_SR data
 */
typedef esp_afe_sr_data_t* (*esp_afe_sr_iface_op_create_from_config_t)(afe_config_t *afe_config);
```
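Steps 1 through 3 combine into a short device-side sketch. This fragment assumes ESP-IDF with the esp-sr component on target and is not compilable on its own:

```c
/* Sketch only: requires ESP-IDF + esp-sr headers on target. */
esp_afe_sr_iface_t *afe_handle = &esp_afe_sr_2mic;   /* step 1: dual-MIC handle       */
afe_config_t afe_config = AFE_CONFIG_DEFAULT();      /* step 2: default configuration */
esp_afe_sr_data_t *afe_data =
    afe_handle->create_from_config(&afe_config);     /* step 3: data handle           */
```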
### 4. feed audio data

After initializing AFE and WakeNet, users need to input audio data into AFE with the `afe_handle->feed()` function for processing.

The input audio size and layout format can be found in the step **Input Audio data**.

```
/**
 * @brief Feed samples of an audio stream to the AFE_SR
 *
 * @param afe The AFE_SR data handle
 * @param in  The input microphone signal, only support signed 16-bit @ 16 KHZ. The frame size can be queried by the
 *            `get_samp_chunksize`. The channel number can be queried by `get_channel_num`.
 * @return    The size of input
 */
typedef int (*esp_afe_sr_iface_op_feed_t)(esp_afe_sr_data_t *afe, const int16_t* in);
```
Get the number of audio channels:

The `afe_handle->get_channel_num()` function returns the number of MIC data channels (without the reference channel) that need to be put into the `afe_handle->feed()` function.

```
/**
 * @brief Get the number of channels that need to be passed to the feed function
 *
 * @param afe The AFE_SR object to query
 * @return The channel number
 */
typedef int (*esp_afe_sr_iface_op_get_channel_num_t)(esp_afe_sr_data_t *afe);
```
### 5. fetch audio data

Users can get the processed single-channel audio with the `afe_handle->fetch()` function.

The number of data sampling points per fetch (the data type of a sampling point is int16) can be obtained with `afe_handle->get_fetch_chunksize`.

```
/**
 * @brief Get the amount of each channel samples per frame that need to be passed to the function
 *
 * Every speech enhancement AFE_SR processes a certain number of samples at the same time. This function
 * can be used to query that amount. Note that the returned amount is in 16-bit samples, not in bytes.
 *
 * @param afe The AFE_SR object to query
 * @return The amount of samples to feed the fetch function
 */
typedef int (*esp_afe_sr_iface_op_get_samp_chunksize_t)(esp_afe_sr_data_t *afe);
```
Please pay attention to the return value of `afe_handle->fetch()`:

- AFE_FETCH_CHANNEL_VERIFIED: audio channel confirmed (this value is not returned for single-microphone wake-up)
- AFE_FETCH_NOISE: noise detected
- AFE_FETCH_SPEECH: speech detected
- AFE_FETCH_WWE_DETECTED: wake word detected
- ...

```
/**
 * @brief fetch enhanced samples of an audio stream from the AFE_SR
 *
 * @Warning The output is single channel data, no matter how many channels the input is.
 *
 * @param afe The AFE_SR object to query
 * @param out The output enhanced signal. The frame size can be queried by the `get_samp_chunksize`.
 * @return The state of output, please refer to the definition of `afe_fetch_mode_t`
 */
typedef afe_fetch_mode_t (*esp_afe_sr_iface_op_fetch_t)(esp_afe_sr_data_t *afe, int16_t* out);
```
### 6. Usage of WakeNet

WakeNet in AFE can be used in three ways:

- No WakeNet

  Users can choose not to initialize WakeNet by not calling:

      afe_handle->set_wakenet(afe_data, &WAKENET_MODEL, &WAKENET_COEFF);

- Use WakeNet

  Users need to configure the wake word by `make menuconfig` first, then call:

      afe_handle->set_wakenet(afe_data, &WAKENET_MODEL, &WAKENET_COEFF);

  In this way, the return value of `afe_handle->fetch()` indicates the wake-up status.

- Disable WakeNet after wake-up

  When users need to perform other operations after wake-up, such as offline or online speech recognition, they can pause the operation of WakeNet to reduce CPU resource consumption.

  Users can call `afe_handle->disable_wakenet(afe_data)` to stop WakeNet, and later call `afe_handle->enable_wakenet(afe_data)` to enable it again.
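The pause/resume flow, as a device-side sketch (assumes esp-sr on target; `start_asr` is a hypothetical application hook, and the fragment is not compilable on its own):

```c
/* Sketch only: requires esp-sr on target. */
afe_fetch_mode_t res = afe_handle->fetch(afe_data, out_buf);
if (res == AFE_FETCH_WWE_DETECTED) {
    afe_handle->disable_wakenet(afe_data);  /* free CPU for the recognizer */
    start_asr();                            /* hypothetical application hook */
}
/* ... later, when recognition finishes: */
afe_handle->enable_wakenet(afe_data);       /* resume wake-word detection */
```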
### 7. Usage of AEC
docs/audio_front_end/README_CN.md (Normal file → Executable file; Chinese translation of the README.md changes above)