docs: update AFE docs

xysun 2025-02-11 18:05:21 +08:00
parent 805b20e083
commit 0e62b69a71
3 changed files with 24 additions and 21 deletions

@@ -67,15 +67,17 @@ The ``input_format`` parameter specifies the arrangement of audio channels in th
+-----------+---------------------+
**Example:**
- ``"MMNR"``: Indicates four channels: two microphone channels, one unused channel, and one playback reference channel.
``"MMNR"`` indicates four channels: two microphone channels, one unused channel, and one playback reference channel.
**Key Points:**
- The input data must be arranged in **channel-interleaved format**.
.. note::
The input data must be arranged in **channel-interleaved format**.
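As a rough illustration of what channel-interleaved means for ``"MMNR"``, the sketch below packs separate per-channel arrays into one buffer in which each sample frame holds the four channels back to back. The helper ``interleave_mmnr`` and the constant ``MMNR_CHANNELS`` are hypothetical names used only for this example and are not part of ESP-SR. Filling the unused channel with zeros is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Channels in the "MMNR" layout: mic, mic, unused, reference. */
#define MMNR_CHANNELS 4

/*
 * Hypothetical helper (not an ESP-SR API): pack per-channel sample arrays
 * into one channel-interleaved buffer laid out as M0 M1 N R, M0 M1 N R, ...
 * `out` must have room for n_samples * MMNR_CHANNELS values.
 */
void interleave_mmnr(const int16_t *mic0, const int16_t *mic1,
                     const int16_t *ref, int16_t *out, size_t n_samples)
{
    assert(mic0 && mic1 && ref && out);
    for (size_t i = 0; i < n_samples; i++) {
        int16_t *frame = &out[i * MMNR_CHANNELS];
        frame[0] = mic0[i]; /* M: first microphone */
        frame[1] = mic1[i]; /* M: second microphone */
        frame[2] = 0;       /* N: unused channel, zero-filled (assumption) */
        frame[3] = ref[i];  /* R: playback reference */
    }
}
```

With this layout, the sample for channel ``c`` at time index ``i`` sits at ``out[i * MMNR_CHANNELS + c]``.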
Using the AFE Framework
----------------------------
In ``menuconfig`` -> ``ESP Speech Recognition``, select the models that the AFE (Audio Front-End) needs, such as the WakeNet model, the VAD (Voice Activity Detection) model, and the NS (Noise Suppression) model, then call the AFE framework from your code using the following steps.
For a reference implementation, see :project_file:`test_apps/esp-sr/main/test_afe.cpp`.
Step 1: Initialize AFE Configuration
@@ -88,10 +90,10 @@ Get the default configuration using ``afe_config_init()`` and customize paramete
srmodel_list_t *models = esp_srmodel_init("model");
afe_config_t *afe_config = afe_config_init("MMNR", models, AFE_TYPE_SR, AFE_MODE_HIGH_PERF);
- **``input_format``**: Define the channel arrangement (e.g., ``"MMNR"``).
- **``models``**: List of models (e.g., for NS, VAD, or WakeNet).
- **``afe_type``**: Type of AFE (e.g., ``AFE_TYPE_SR`` for speech recognition).
- **``afe_mode``**: Performance mode (e.g., ``AFE_MODE_HIGH_PERF``).
- ``input_format``: Define the channel arrangement (e.g., ``MMNR``).
- ``models``: List of models (e.g., for NS, VAD, or WakeNet).
- ``afe_type``: Type of AFE (e.g., ``AFE_TYPE_SR`` for speech recognition).
- ``afe_mode``: Performance mode (e.g., ``AFE_MODE_HIGH_PERF``).
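To make the ``input_format`` string more concrete, here is a minimal sketch that counts the channel types in a format such as ``MMNR``. It assumes the letter meanings implied by the example above (``M`` microphone, ``N`` unused, ``R`` playback reference). The type ``format_counts_t`` and the function ``count_format_channels`` are hypothetical and do not belong to the ESP-SR API.

```c
#include <stddef.h>

/* Channel counts derived from an input_format string (hypothetical type). */
typedef struct {
    int mic;    /* number of 'M' entries */
    int ref;    /* number of 'R' entries */
    int unused; /* number of 'N' entries */
} format_counts_t;

/*
 * Hypothetical helper (not an ESP-SR API): count the channel types in a
 * format string. Returns 0 on success, or -1 if an argument is NULL or the
 * string contains a character other than 'M', 'N', or 'R'.
 */
int count_format_channels(const char *fmt, format_counts_t *out)
{
    if (fmt == NULL || out == NULL) {
        return -1;
    }
    out->mic = out->ref = out->unused = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'M': out->mic++;    break;
        case 'R': out->ref++;    break;
        case 'N': out->unused++; break;
        default:  return -1;     /* unknown channel letter */
        }
    }
    return 0;
}
```

The total channel count is the sum of the three fields, which for ``MMNR`` is four.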
Step 2: Create AFE Instance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -117,9 +119,9 @@ Input audio data to the AFE for processing. The input data must match the ``inpu
int16_t *feed_buff = (int16_t *) malloc(feed_chunksize * feed_nch * sizeof(int16_t));
afe_handle->feed(afe_data, feed_buff);
- **``feed_chunksize``**: Number of samples to feed per frame.
- **``feed_nch``**: Number of channels in the input data.
- **``feed_buff``**: Channel-interleaved audio data (16-bit signed, 16 kHz).
- ``feed_chunksize``: Number of samples to feed per frame.
- ``feed_nch``: Number of channels in the input data.
- ``feed_buff``: Channel-interleaved audio data (16-bit signed, 16 kHz).
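The buffer size in the ``malloc`` call above follows directly from these three parameters. The sketch below spells out the arithmetic and also shows how much audio time one frame covers at 16 kHz. The function names and ``EXAMPLE_SAMPLE_RATE_HZ`` are hypothetical, and the chunk size of 512 mentioned afterwards is only an illustrative value, not one the AFE is known to return.

```c
#include <stddef.h>
#include <stdint.h>

/* Sample rate stated in the text (16 kHz); the macro name is hypothetical. */
#define EXAMPLE_SAMPLE_RATE_HZ 16000

/*
 * Bytes needed for one feed frame: samples per channel * channels * 2 bytes.
 * Returns 0 for non-positive arguments.
 */
size_t feed_buffer_bytes(int feed_chunksize, int feed_nch)
{
    if (feed_chunksize <= 0 || feed_nch <= 0) {
        return 0;
    }
    return (size_t)feed_chunksize * (size_t)feed_nch * sizeof(int16_t);
}

/* Duration of one feed frame in whole milliseconds at 16 kHz. */
int feed_frame_ms(int feed_chunksize)
{
    return feed_chunksize * 1000 / EXAMPLE_SAMPLE_RATE_HZ;
}
```

For example, an assumed chunk size of 512 samples with four channels needs 4096 bytes and covers 32 ms of audio.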
Step 4: Fetch Processed Audio
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


@@ -51,7 +51,7 @@ Resource Consumption
- **MMNR:** two microphone channels, one unused channel, and one playback reference channel
- **Models:** nsnet2, vadnet1_medium, wn9_hilexin
.. list-table:: ESP32-S3 AFE configuration and Performance
.. list-table:: AFE Configuration and Performance
:widths: 25 15 15 20 20
:header-rows: 1


@@ -64,10 +64,11 @@ AFE Acoustic Front-End Algorithm Framework
+-----------+---------------------+
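**Example:**
- ``"MMNR"``: Indicates a four-channel arrangement consisting of two microphone channels, one unused channel, and one playback reference channel.
``"MMNR"``: Indicates a four-channel arrangement consisting of two microphone channels, one unused channel, and one playback reference channel.
**Key Points:**
- The input data must be arranged in **channel-interleaved format**.
.. note::
The input data must be arranged in **channel-interleaved format**.
Using the AFE Framework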
----------------------------
@@ -84,10 +85,10 @@ AFE Acoustic Front-End Algorithm Framework
srmodel_list_t *models = esp_srmodel_init("model");
afe_config_t *afe_config = afe_config_init("MMNR", models, AFE_TYPE_SR, AFE_MODE_HIGH_PERF);
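- **``input_format``**: Defines the channel arrangement (e.g., ``"MMNR"``).
- **``models``**: List of models (e.g., NS, VAD, or WakeNet models).
- **``afe_type``**: AFE type (e.g., ``AFE_TYPE_SR`` for speech recognition scenarios).
- **``afe_mode``**: Performance mode (e.g., ``AFE_MODE_HIGH_PERF`` for high-performance mode).
- ``input_format``: Defines the channel arrangement (e.g., ``MMNR``).
- ``models``: List of models (e.g., NS, VAD, or WakeNet models).
- ``afe_type``: AFE type (e.g., ``AFE_TYPE_SR`` for speech recognition scenarios).
- ``afe_mode``: Performance mode (e.g., ``AFE_MODE_HIGH_PERF`` for high-performance mode).
Step 2: Create AFE Instance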
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -113,9 +114,9 @@ AFE Acoustic Front-End Algorithm Framework
int16_t *feed_buff = (int16_t *) malloc(feed_chunksize * feed_nch * sizeof(int16_t));
afe_handle->feed(afe_data, feed_buff);
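- **``feed_chunksize``**: Number of samples fed per frame.
- **``feed_nch``**: Number of channels in the input data.
- **``feed_buff``**: Channel-interleaved audio data (16-bit signed, 16 kHz).
- ``feed_chunksize``: Number of samples fed per frame.
- ``feed_nch``: Number of channels in the input data.
- ``feed_buff``: Channel-interleaved audio data (16-bit signed, 16 kHz).
Step 4: Fetch Processed Results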
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^