diff --git a/.gitignore b/.gitignore
index 623a9de..379c609 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,3 +24,6 @@ docs/doxygen_sqlite3.db
 # Downloaded font files
 docs/_static/DejaVuSans.ttf
 docs/_static/NotoSansSC-Regular.otf
+model/target/*
+.vscode
+docs/_build/*
diff --git a/docs/__pycache__/conf_common.cpython-37.pyc b/docs/__pycache__/conf_common.cpython-37.pyc
index 11c070f..9c40f58 100644
Binary files a/docs/__pycache__/conf_common.cpython-37.pyc and b/docs/__pycache__/conf_common.cpython-37.pyc differ
diff --git a/docs/_static/QR_Dilated_Convolution.png b/docs/_static/QR_Dilated_Convolution.png
new file mode 100644
index 0000000..78705b4
Binary files /dev/null and b/docs/_static/QR_Dilated_Convolution.png differ
diff --git a/docs/_static/QR_MFCC.png b/docs/_static/QR_MFCC.png
new file mode 100644
index 0000000..2d325ab
Binary files /dev/null and b/docs/_static/QR_MFCC.png differ
diff --git a/docs/_static/QR_multinet_g2p.png b/docs/_static/QR_multinet_g2p.png
new file mode 100644
index 0000000..98e3736
Binary files /dev/null and b/docs/_static/QR_multinet_g2p.png differ
diff --git a/docs/conf_common.py b/docs/conf_common.py
index 50cff10..5bed85b 100755
--- a/docs/conf_common.py
+++ b/docs/conf_common.py
@@ -23,7 +23,6 @@ ESP32_DOCS = ['audio_front_end/README.rst',
               'wake_word_engine/README.rst',
               'wake_word_engine/ESP_Wake_Words_Customization.rst',
               'speech_command_recognition/README.rst',
-              'acoustic_algorithm/README.rst',
               'flash_model/README.rst',
               'audio_front_end/Espressif_Microphone_Design_Guidelines.rst',
               'test_report/README.rst',
diff --git a/docs/convert-md-to-rst.sh b/docs/convert-md-to-rst.sh
deleted file mode 100755
index 6a368e7..0000000
--- a/docs/convert-md-to-rst.sh
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/bin/bash
-
-function convert_md2rst(){
-    for files in $1/$2/*
-    do
-        filename="$(basename -- $files)"
-        echo $filename
-        fname="${filename%.*}"
-        echo $fname
-        echo "converting $fname"
-        pandoc $1/$2/$filename -f markdown -t rst -s -o "$1/$2/${fname}".rst
-    done
-}
-
-convert_md2rst en acoustic_algorithm
-convert_md2rst en audio_front_end
-convert_md2rst en flash_model
-convert_md2rst en performance_test
-convert_md2rst en speech_command_recognition
-convert_md2rst en wake_word_engine
-
-convert_md2rst zh_cn acoustic_algorithm
-convert_md2rst zh_cn audio_front_end
-convert_md2rst zh_cn flash_model
-convert_md2rst zh_cn performance_test
-convert_md2rst zh_cn speech_command_recognition
-convert_md2rst zh_cn wake_word_engine
\ No newline at end of file
diff --git a/docs/en/acoustic_algorithm/README.rst b/docs/en/acoustic_algorithm/README.rst
deleted file mode 100644
index 38d10a6..0000000
--- a/docs/en/acoustic_algorithm/README.rst
+++ /dev/null
@@ -1,248 +0,0 @@
-Acoustic Algorithm Introduction
-===============================
-
-:link_to_translation:`zh_CN:[中文]`
-
-Acoustic algorithms provided in esp-sr include voice activity detection(VAD), adaptive gain control (AGC), acoustic echo cancellation (AEC),noise suppression (NS), and mic-array speech enhancement (MASE). VAD, AGC, AEC, and NS are supported with either single-mic and multi-mic development board, MASE is supported with multi-mic board only.
-
-VAD
----
-
-Overview
-~~~~~~~~
-
-VAD takes an audio stream as input, and outputs the prediction that a frame of the stream contains audio or not.
-
-API Reference
-~~~~~~~~~~~~~
-
-Header
-^^^^^^
-
--  esp_vad.h
-
-Function
-^^^^^^^^
-
--  ``vad_handle_t vad_create(vad_mode_t vad_mode)``
-
-    **Definition**
-
-    Initialization of VAD handle.
-
-    **Parameter**
-
-    -  vad_mode: operating mode of VAD, VAD_MODE_0 to VAD_MODE_4, larger value indicates more aggressive VAD.
-
-    **Return**
-
-    Handle to VAD.
-
--  ``vad_state_t vad_process(vad_handle_t inst, int16_t *data, int sample_rate_hz, int one_frame_ms);``
-
-    **Definition**
-
-    Processing of VAD for one frame.
-
-    **Parameter**
-
-    -  inst: VAD handle.
-    -  data: buffer to save both input and output audio stream.
-    -  sample_rate_hz: The Sampling frequency (Hz) can be 32000, 16000, 8000, default: 16000.
-    -  one_frame_ms: The length of the audio processing can be 10ms, 20ms, 30ms, default: 30.
-
-    **Return**
-
-    -  VAD_SILENCE if no voice
-    -  VAD_SPEECH if voice is detected
-
--  ``void vad_destroy(vad_handle_t inst)``
-
-    **Definition**
-
-    Destruction of a VAD handle.
-
-    **Parameter**
-
-    -  inst: the VAD handle to be destroyed.
-
-AGC
----
-
-.. _overview-1:
-
-Overview
-~~~~~~~~
-
-AGC keeps the volume of audio signal at a stable level to avoid the situation that the signal is so loud that gets clipped or too quiet to trigger the speech recognizer.
-
-.. _api-reference-1:
-
-API Reference
-~~~~~~~~~~~~~
-
--  ``void *esp_agc_open(int agc_mode, int sample_rate)``
-
-    **Definition**
-
-    Initialization of AGC handle.
-
-    **Parameter**
-
-    -  agc_mode: operating mode of AGC, 3 to enable AGC and 0 to disable it.
-    -  sample_rate: sampling rate of audio signal.
-
-    **Return**
-
-    -  AGC handle.
-
--  ``int esp_agc_process(void *agc_handle, short *in_pcm, short *out_pcm, int frame_size, int sample_rate)``
-
-    **Definition**
-
-    Pocessing of AGC for one frame.
-
-    **Parameter**
-
-    -  agc_handle: AGC handle.
-    -  in_pcm: input audio stream.
-    -  out_pcm: output audio stream.
-    -  frame_size: signal frame length in ms.
-    -  sample_rate: signal sampling rate in Hz.
-
-    **Return**
-
-    Return 0 if AGC processing succeeds, -1 if fails; -2 and -3 indicate invalid input of sample_rate and frame_size, respectively.
-
--  ``void esp_agc_clse(void *agc_handle)``
-
-    **Definition**
-
-    Destruction of an AGC handle.
-
-    **Parameter**
-
-    -  agc_handle: the AGC handle to be destroyed.
-
-AEC
----
-
-.. _overview-2:
-
-Overview
-~~~~~~~~
-
-AEC suppresses echo of the sound played by the speaker of the board.
-
-.. _api-reference-2:
-
-API Reference
-~~~~~~~~~~~~~
-
--  ``aec_handle_t aec_create(int sample_rate, int frame_length, int filter_length)``
-
-    **Definition**
-
-    Initialization of AEC handle.
-
-    **Parameter**
-
-    -  sample_rate: audio signal sampling rate.
-    -  frame_length: audio frame length in ms.
-    -  filter_length: the length of adaptive filter in AEC.
-
-    **Return**
-
-    Handle to AEC.
-
--  ``aec_create_t aec_create_multimic(int sample_rate, int frame_length, int filter_length, int nch)``
-
-    **Definition**
-
-    Initialization of AEC handle.
-
-    **Parameter**
-
-    -  sample_rate: audio signal sampling rate.
-    -  frame_length: audio frame length in ms.
-    -  filter_length: the length of adaptive filter in AEC.
-    -  nch: number of channels of the signal to be processed.
-
-    **Return**
-
-    Handle to AEC.
-
--  ``void aec_process(aec_handle_t inst, int16_t *indata, int16_t *refdata, int16_t *outdata)``
-
-    **Definition**
-
-    Processing of AEC for one frame.
-
-    **Parameter**
-
-    -  inst: AEC handle.
-    -  indata: input audio stream, which could be single- or multi-channel, depending on the channel number defined on initialization.
-    -  refdata: reference signal to be cancelled from the input.
-    -  outdata: output audio stream, the number of channels is the same as indata.
-
--  ``void aec_destroy(aec_handle_t inst)``
-
-    **Definition**
-
-    Destruction of an AEC handle.
-
-    **Parameter**
-
-    -  inst: the AEC handle to be destroyed.
-
-NS
---
-
-.. _overview-3:
-
-Overview
-~~~~~~~~
-
-Single-channel speech enhancement. If multiple mics are available with the board, MASE is recommened for noise suppression.
-
-.. _api-reference-3:
-
-API Reference
-~~~~~~~~~~~~~
-
--  ``ns_handle_t ns_pro_create(int frame_length, int mode)``
-
-    **Definition**
-
-    Creates an instance of the more powerful noise suppression algorithm.
-
-    **Parameter**
-
-    -  frame_length_ms: audio frame length in ms.
-    -  mode: 0: Mild, 1: Medium, 2: Aggressive
-
-    **Return**
-
-    Handle to NS.
-
--  ``void ns_process(ns_handle_t inst, int16_t *indata, int16_t *outdata)``
-
-    **Definition**
-
-    Prodessing of NS for one frame.
-
-    **Parameter**
-
-    -  inst: NS handle.
-    -  indata: input audio stream.
-    -  outdata: output audio stream.
-
--  ``void ns_destroy(ns_handle_t inst)``
-
-    **Definition**
-
-    Destruction of a NS handle.
-
-    **Parameter**
-
-    -  inst: the NS handle to be destroyed.
diff --git a/docs/en/index.rst b/docs/en/index.rst
index 0a8f648..526efda 100644
--- a/docs/en/index.rst
+++ b/docs/en/index.rst
@@ -21,7 +21,6 @@ Based on years of hardware design and development experience, Loxin can provide
     Wake word model <wake_word_engine/README>
     Customized wake words <wake_word_engine/ESP_Wake_Words_Customization>
     Speech commands <speech_command_recognition/README>
-    Acoustic algorithm introduction <acoustic_algorithm/README>
     Model loading method <flash_model/README>
     Microphone Design Guidelines <audio_front_end/Espressif_Microphone_Design_Guidelines>
     Test Reports <test_report/README>
diff --git a/docs/zh_CN/acoustic_algorithm/README.rst b/docs/zh_CN/acoustic_algorithm/README.rst
deleted file mode 100644
index 88d90ce..0000000
--- a/docs/zh_CN/acoustic_algorithm/README.rst
+++ /dev/null
@@ -1,248 +0,0 @@
-声学算法介绍
-============
-
-:link_to_translation:`en:[English]`
-
-esp-sr 中提供的声学算法包括语音活动检测 (VAD)、自适应增益控制 (AGC)、声学回声消除 (AEC)、噪声抑制 (NS) 和麦克风阵列语音增强 (MASE)。 VAD、AGC、AEC 和 NS 支持单麦克风和多麦克风开发板，MASE 仅支持多麦克风板。
-
-VAD
----
-
-概述
-~~~~
-
-VAD将一个音频流作为输入，并输出该流的某一帧是否包含音频的预测。
-
-API 参考
-~~~~~~~~~~~~~
-
-头文件
-^^^^^^
-
--  esp_vad.h
-
-函数
-^^^^
-
--  ``vad_handle_t vad_create(vad_mode_t vad_mode)``
-
-    **定义**
-
-    VAD 句柄的初始化。
-
-    **范围**
-
-    -  vad_mode：VAD的工作模式，VAD_MODE_0到VAD_MODE_4，数值越大表示VAD越激进。
-
-    **返回值**
-
-    vad_handle_t
-
--  ``vad_state_t vad_process(vad_handle_t inst, int16_t *data, int sample_rate_hz, int one_frame_ms);``
-
-    **定义**
-
-    处理一帧的 VAD。
-
-    **范围**
-
-    - inst：VAD句柄。
-    - data: 保存输入和输出音频流的缓冲区。
-    - sample_rate_hz: 采样频率（Hz）可以是32000、16000、8000，默认是16000。
-    - one_frame_ms: 音频处理的长度可以是10ms、20ms、30ms，默认：30。
-
-    **返回值**
-
-    -  VAD_SILENCE if no voice
-    -  VAD_SPEECH if voice is detected
-
--  ``void vad_destroy(vad_handle_t inst)``
-
-    **定义**
-
-    -  销毁 VAD 句柄.
-
-    **范围**
-
-    -  inst：要销毁的VAD句柄。
-
-AGC
----
-
-.. _overview-1:
-
-概述
-~~~~~~~~
-
-AGC将音频信号的音量保持在一个稳定的水平，以避免信号过大而被削掉或过小而无法触发语音识别器的情况。
-
-.. _api-reference-1:
-
-API 参考
-~~~~~~~~~~~~~
-
--  ``void *esp_agc_open(int agc_mode, int sample_rate)``
-
-    **定义**
-
-    AGC句柄的初始化。  
-
-    **范围**
-
-    - agc_mode：AGC的工作模式，3表示启用AGC，0表示禁用。
-    - sample_rate：音频信号的采样率。
-
-    **返回值**
-
-    -  AGC 句柄.
-
--  ``int esp_agc_process(void *agc_handle, short *in_pcm, short *out_pcm, int frame_size, int sample_rate)``
-
-    **定义**
-
-    对一帧的AGC进行分配。
-
-    **范围**
-
-    - agc_handle: AGC手柄。
-    - in_pcm: 输入音频流。
-    - out_pcm：输出音频流。
-    - frame_size: 信号帧的长度，单位是ms。
-    - sample_rate：信号的采样率，单位为Hz。
-
-    **返回值**
-
-    - 返回 0 如果 AGC processing 成功, -1 如果失败; -2 和 -3 分别表示采样率和帧大小的无效输入。
-
--  ``void esp_agc_clse(void *agc_handle)``
-
-    **定义**
-
-    - 销毁一个AGC句柄。
-
-    **范围**
-
-    -  agc_handle: 销毁AGC句柄。
-
-AEC
----
-
-.. _overview-2:
-
-概述
-~~~~~~~~
-
-AEC抑制了电路板上的扬声器所播放的声音的回声。
-
-.. _api-reference-2:
-
-API 参考
-~~~~~~~~~~~~~
-
--  ``aec_handle_t aec_create(int sample_rate, int frame_length, int filter_length)``
-
-    **定义**
-
-    AEC 句柄的初始化。
-
-    **范围**
-
-    -  sample_rate: audio signal sampling rate.
-    -  frame_length: audio frame length in ms.
-    -  filter_length: the length of adaptive filter in AEC.
-
-    **返回值**
-
-    Handle to AEC.
-
--  ``aec_create_t aec_create_multimic(int sample_rate, int frame_length, int filter_length, int nch)``
-
-    **定义**
-
-    AEC 句柄的初始化。
-
-    **范围**
-
-    - sample_rate：音频信号采样率。
-    - frame_length：以毫秒为单位的音频帧长度。
-    - filter_length：AEC 中自适应滤波器的长度。
-    - nch：要处理的信号的通道数。
-
-    **返回值**
-
-    Handle to AEC.
-
--  ``void aec_process(aec_handle_t inst, int16_t *indata, int16_t *refdata, int16_t *outdata)``
-
-    **定义**
-
-    一帧的AEC处理。
-
-    **范围**
-
-    - inst：AEC 手柄。
-    - indata：输入音频流，可以是单声道或多声道，取决于初始化时定义的声道号。
-    - refdata：要从输入中取消的参考信号。
-    - outdata：输出音频流，通道数与indata相同。
-
--  ``void aec_destroy(aec_handle_t inst)``
-
-    **定义**
-
-    AEC 句柄的破坏。
-
-    **范围**
-
-    -inst：要销毁的 AEC 句柄。
-
-NS
---
-
-.. _overview-3:
-
-概述
-~~~~~~~~
-
-单通道语音增强。如果电路板上有多个麦克风可用，建议使用 MASE 进行噪声抑制。
-
-.. _api-reference-3:
-
-API 参考
-~~~~~~~~~~~~~
-
--  ``ns_handle_t ns_pro_create(int frame_length, int mode)``
-
-    **定义**
-
-    创建更强大的噪声抑制算法的实例。
-
-    **范围**
-
-    - frame_length_ms：以毫秒为单位的音频帧长度。
-    - mode：0：轻度，1：中度，2：激进
-
-    **返回值**
-
-    Handle to NS.
-
--  ``void ns_process(ns_handle_t inst, int16_t *indata, int16_t *outdata)``
-
-    **定义**
-
-    NS 处理一帧。
-
-    **范围**
-
-    - inst：NS 句柄。
-    - indata：输入音频流。
-    - outdata：输出音频流。
-
--  ``void ns_destroy(ns_handle_t inst)``
-
-    **定义**
-
-    NS句柄的破坏。
-
-    **范围**
-
-    - inst：要销毁的 NS 句柄。
diff --git a/docs/zh_CN/audio_front_end/README.rst b/docs/zh_CN/audio_front_end/README.rst
index 4475b73..d2716f8 100644
--- a/docs/zh_CN/audio_front_end/README.rst
+++ b/docs/zh_CN/audio_front_end/README.rst
@@ -164,236 +164,238 @@ WakeNet or Bypass 简介
 
 AFE 的输出音频为单通道数据。在语音识别场景，若WakeNet 开启的情况下，AFE 会输出有目标人声的单通道数据。在语音通话场景，将会输出信噪比更高的单通道数据。
 
-快速开始
---------
+.. only:: html
 
-定义 afe_handle
-~~~~~~~~~~~~~~~~~~
+    快速开始
+    --------
 
-``afe_handle`` 是用户后续调用 afe 接口的函数句柄。所以第一步需先获得 ``afe_handle``。
+    定义 afe_handle
+    ~~~~~~~~~~~~~~~~~~
 
--  语音识别
+    ``afe_handle`` 是用户后续调用 afe 接口的函数句柄。所以第一步需先获得 ``afe_handle``。
+
+    -  语音识别
+
+        ::
+
+            esp_afe_sr_iface_t *afe_handle = &ESP_AFE_SR_HANDLE;
+
+    -  语音通话
+
+        ::
+
+            esp_afe_sr_iface_t *afe_handle = &ESP_AFE_VC_HANDLE;
+
+    配置 afe
+    ~~~~~~~~~~~
+
+    获取 afe 的配置：
 
     ::
 
-        esp_afe_sr_iface_t *afe_handle = &ESP_AFE_SR_HANDLE;
+        afe_config_t afe_config = AFE_CONFIG_DEFAULT();
 
--  语音通话
+    可调整 ``afe_config`` 中各算法模块的使能及其相应参数:
 
     ::
 
-        esp_afe_sr_iface_t *afe_handle = &ESP_AFE_VC_HANDLE;
+        #define AFE_CONFIG_DEFAULT() { \
+            .aec_init = true, \
+            .se_init = true, \
+            .vad_init = true, \
+            .wakenet_init = true, \
+            .voice_communication_init = false, \
+            .voice_communication_agc_init = false, \
+            .voice_communication_agc_gain = 15, \
+            .vad_mode = VAD_MODE_3, \
+            .wakenet_model_name = NULL, \
+            .wakenet_mode = DET_MODE_2CH_90, \
+            .afe_mode = SR_MODE_LOW_COST, \
+            .afe_perferred_core = 0, \
+            .afe_perferred_priority = 5, \
+            .afe_ringbuf_size = 50, \
+            .memory_alloc_mode = AFE_MEMORY_ALLOC_MORE_PSRAM, \
+            .agc_mode = AFE_MN_PEAK_AGC_MODE_2, \
+            .pcm_config.total_ch_num = 3, \
+            .pcm_config.mic_num = 2, \
+            .pcm_config.ref_num = 1, \
+        }
 
-配置 afe
-~~~~~~~~~~~
+    -  aec_init: AEC 算法是否使能。
 
-获取 afe 的配置：
+    -  se_init: BSS/NS 算法是否使能。
 
-::
+    -  vad_init: VAD 是否使能 ( 仅可在语音识别场景中使用 )
 
-    afe_config_t afe_config = AFE_CONFIG_DEFAULT();
+    -  wakenet_init: 唤醒是否使能。
 
-可调整 ``afe_config`` 中各算法模块的使能及其相应参数:
+    -  voice_communication_init: 语音通话是否使能。与 wakenet_init
+        不能同时使能。
 
-::
+    -  voice_communication_agc_init: 语音通话中AGC是否使能。
 
-    #define AFE_CONFIG_DEFAULT() { \
-        .aec_init = true, \
-        .se_init = true, \
-        .vad_init = true, \
-        .wakenet_init = true, \
-        .voice_communication_init = false, \
-        .voice_communication_agc_init = false, \
-        .voice_communication_agc_gain = 15, \
-        .vad_mode = VAD_MODE_3, \
-        .wakenet_model_name = NULL, \
-        .wakenet_mode = DET_MODE_2CH_90, \
-        .afe_mode = SR_MODE_LOW_COST, \
-        .afe_perferred_core = 0, \
-        .afe_perferred_priority = 5, \
-        .afe_ringbuf_size = 50, \
-        .memory_alloc_mode = AFE_MEMORY_ALLOC_MORE_PSRAM, \
-        .agc_mode = AFE_MN_PEAK_AGC_MODE_2, \
-        .pcm_config.total_ch_num = 3, \
-        .pcm_config.mic_num = 2, \
-        .pcm_config.ref_num = 1, \
-    }
+    -  voice_communication_agc_gain: AGC的增益值，单位为dB。
 
--  aec_init: AEC 算法是否使能。
+    -  vad_mode: VAD 检测的操作模式，越大越激进。
 
--  se_init: BSS/NS 算法是否使能。
+    -  wakenet_model_name: 宏 ``AFE_CONFIG_DEFAULT()``  中该值默认为NULL。使用 ``idf.py menuconfig`` 选择了相应的唤醒模型后，在调用 ``afe_handle->create_from_config`` 之前，需给该处赋值具体的模型名字，类型为字符串形式。唤醒模型的具体说明，详见： `flash_model <../flash_model/README_cn.md>`__ (注意：示例代码中，使用了 esp_srmodel_filter() 获取模型名字，若 menuconfig 中选择了多个模型共存，该函数将会随机返回一个模型名字)
 
--  vad_init: VAD 是否使能 ( 仅可在语音识别场景中使用 )
+    -  wakenet_mode: 唤醒的模式。对应为多少通道的唤醒，根据mic通道的数量选择
 
--  wakenet_init: 唤醒是否使能。
+    -  afe_mode: 乐鑫 AFE 目前支持 2 种工作模式，分别为：SR_MODE_LOW_COST,SR_MODE_HIGH_PERF。详细可见 afe_sr_mode_t 枚举。
 
--  voice_communication_init: 语音通话是否使能。与 wakenet_init
-    不能同时使能。
+        -  SR_MODE_LOW_COST: 量化版本，占用资源较少。
 
--  voice_communication_agc_init: 语音通话中AGC是否使能。
+        -  SR_MODE_HIGH_PERF: 非量化版本，占用资源较多。
 
--  voice_communication_agc_gain: AGC的增益值，单位为dB。
+        **ESP32 芯片，只支持模式 SR_MODE_HIGH_PERF; ESP32S3 芯片，两种模式均支持**
 
--  vad_mode: VAD 检测的操作模式，越大越激进。
+    -  afe_perferred_core: AFE 内部 BSS/NS/MISO 算法，运行在哪个 CPU 核。
 
--  wakenet_model_name: 宏 ``AFE_CONFIG_DEFAULT()``  中该值默认为NULL。使用 ``idf.py menuconfig`` 选择了相应的唤醒模型后，在调用 ``afe_handle->create_from_config`` 之前，需给该处赋值具体的模型名字，类型为字符串形式。唤醒模型的具体说明，详见： `flash_model <../flash_model/README_cn.md>`__ (注意：示例代码中，使用了 esp_srmodel_filter() 获取模型名字，若 menuconfig 中选择了多个模型共存，该函数将会随机返回一个模型名字)
+    -  afe_perferred_priority: AFE 内部 BSS/NS/MISO 算法，运行的task优先级。
 
--  wakenet_mode: 唤醒的模式。对应为多少通道的唤醒，根据mic通道的数量选择
+    -  afe_ringbuf_size: 内部 ringbuf 大小的配置。
 
--  afe_mode: 乐鑫 AFE 目前支持 2 种工作模式，分别为：SR_MODE_LOW_COST,SR_MODE_HIGH_PERF。详细可见 afe_sr_mode_t 枚举。
+    -  memory_alloc_mode: 内存分配的模式。可配置三个值：
 
-    -  SR_MODE_LOW_COST: 量化版本，占用资源较少。
+        -  AFE_MEMORY_ALLOC_MORE_INTERNAL: 更多的从内部ram分配。
 
-    -  SR_MODE_HIGH_PERF: 非量化版本，占用资源较多。
+        -  AFE_MEMORY_ALLOC_INTERNAL_PSRAM_BALANCE: 部分从内部ram分配。
 
-      **ESP32 芯片，只支持模式 SR_MODE_HIGH_PERF; ESP32S3 芯片，两种模式均支持**
+        -  AFE_MEMORY_ALLOC_MORE_PSRAM: 绝大部分从外部psram分配
 
--  afe_perferred_core: AFE 内部 BSS/NS/MISO 算法，运行在哪个 CPU 核。
+    -  agc_mode: 将音频线性放大的 level 配置，该配置在语音识别场景下起作用，并且在唤醒使能时才生效。可配置四个值：
 
--  afe_perferred_priority: AFE 内部 BSS/NS/MISO 算法，运行的task优先级。
+        -  AFE_MN_PEAK_AGC_MODE_1: 线性放大喂给后续multinet的音频，峰值处为 -5dB。
 
--  afe_ringbuf_size: 内部 ringbuf 大小的配置。
+        -  AFE_MN_PEAK_AGC_MODE_2: 线性放大喂给后续multinet的音频，峰值处为 -4dB。
 
--  memory_alloc_mode: 内存分配的模式。可配置三个值：
+        -  AFE_MN_PEAK_AGC_MODE_3: 线性放大喂给后续multinet的音频，峰值处为 -3dB。
 
-    -  AFE_MEMORY_ALLOC_MORE_INTERNAL: 更多的从内部ram分配。
+        -  AFE_MN_PEAK_NO_AGC: 不做线性放大
 
-    -  AFE_MEMORY_ALLOC_INTERNAL_PSRAM_BALANCE: 部分从内部ram分配。
+    -  pcm_config: 根据 ``afe->feed()`` 喂入的音频结构进行配置，该结构体有三个成员变量需要配置：
 
-    -  AFE_MEMORY_ALLOC_MORE_PSRAM: 绝大部分从外部psram分配
+        -  total_ch_num: 音频总的通道数，total_ch_num = mic_num + ref_num。
 
--  agc_mode: 将音频线性放大的 level 配置，该配置在语音识别场景下起作用，并且在唤醒使能时才生效。可配置四个值：
+        -  mic_num: 音频的麦克风通道数。目前仅支持配置为 1 或 2。
 
-    -  AFE_MN_PEAK_AGC_MODE_1: 线性放大喂给后续multinet的音频，峰值处为 -5dB。
+        -  ref_num: 音频的参考回路通道数，目前仅支持配置为 0 或 1。
 
-    -  AFE_MN_PEAK_AGC_MODE_2: 线性放大喂给后续multinet的音频，峰值处为 -4dB。
+    创建 afe_data
+    ~~~~~~~~~~~~~~~~
 
-    -  AFE_MN_PEAK_AGC_MODE_3: 线性放大喂给后续multinet的音频，峰值处为 -3dB。
+    用户使用 ``afe_handle->create_from_config(&afe_config)`` 函数来获得数据句柄，这将会在afe内部使用，传入的参数即为上面第2步中获得的配置。
 
-    -  AFE_MN_PEAK_NO_AGC: 不做线性放大
+    ::
 
--  pcm_config: 根据 ``afe->feed()`` 喂入的音频结构进行配置，该结构体有三个成员变量需要配置：
+        /**
+        * @brief Function to initialize an AFE_SR instance
+        *
+        * @param afe_config        The config of AFE_SR
+        * @returns Handle to the AFE_SR data
+        */
+        typedef esp_afe_sr_data_t* (*esp_afe_sr_iface_op_create_from_config_t)(afe_config_t *afe_config);
 
-    -  total_ch_num: 音频总的通道数，total_ch_num = mic_num + ref_num。
+    feed 音频数据
+    ~~~~~~~~~~~~~~~~
 
-    -  mic_num: 音频的麦克风通道数。目前仅支持配置为 1 或 2。
+    在初始化 AFE 完成后，用户需要将音频数据使用 ``afe_handle->feed()`` 函数输入到 AFE 中进行处理。
 
-    -  ref_num: 音频的参考回路通道数，目前仅支持配置为 0 或 1。
+    输入的音频大小和排布格式可以参考 **输入音频** 这一步骤。
 
-创建 afe_data
-~~~~~~~~~~~~~~~~
+    ::
 
-用户使用 ``afe_handle->create_from_config(&afe_config)`` 函数来获得数据句柄，这将会在afe内部使用，传入的参数即为上面第2步中获得的配置。
+        /**
+        * @brief Feed samples of an audio stream to the AFE_SR
+        *
+        * @Warning  The input data should be arranged in the format of channel interleaving.
+        *           The last channel is reference signal if it has reference data.
+        *
+        * @param afe   The AFE_SR object to query
+        * 
+        * @param in    The input microphone signal, only support signed 16-bit @ 16 KHZ. The frame size can be queried by the 
+        *              `get_feed_chunksize`.
+        * @return      The size of input
+        */
+        typedef int (*esp_afe_sr_iface_op_feed_t)(esp_afe_sr_data_t *afe, const int16_t* in);
 
-::
+    获取音频通道数：
 
-    /**
-    * @brief Function to initialze a AFE_SR instance
-    * 
-    * @param afe_config        The config of AFE_SR
-    * @returns Handle to the AFE_SR data
-    */
-   typedef esp_afe_sr_data_t* (*esp_afe_sr_iface_op_create_from_config_t)(afe_config_t *afe_config);
+    使用 ``afe_handle->get_total_channel_num()`` 函数可以获取需要传入 ``afe_handle->feed()`` 函数的总数据通道数。其返回值等于AFE_CONFIG_DEFAULT()中配置的 ``pcm_config.mic_num + pcm_config.ref_num``
 
-feed 音频数据
-~~~~~~~~~~~~~~~~
+    ::
 
-在初始化 AFE 完成后，用户需要将音频数据使用 ``afe_handle->feed()`` 函数输入到 AFE 中进行处理。
+        /**
+        * @brief Get the configured total number of channels
+        *
+        * @param afe   The AFE_SR object to query
+        * @return      The total number of channels
+        */
+        typedef int (*esp_afe_sr_iface_op_get_total_channel_num_t)(esp_afe_sr_data_t *afe);
 
-输入的音频大小和排布格式可以参考 **输入音频** 这一步骤。
+    fetch 音频数据
+    ~~~~~~~~~~~~~~
 
-::
+    用户调用 ``afe_handle->fetch()`` 函数可以获取处理完成的单通道音频以及相关处理信息。
 
-    /**
-    * @brief Feed samples of an audio stream to the AFE_SR
-    *
-    * @Warning  The input data should be arranged in the format of channel interleaving.
-    *           The last channel is reference signal if it has reference data.
-    *
-    * @param afe   The AFE_SR object to query
-    * 
-    * @param in    The input microphone signal, only support signed 16-bit @ 16 KHZ. The frame size can be queried by the 
-    *              `get_feed_chunksize`.
-    * @return      The size of input
-    */
-   typedef int (*esp_afe_sr_iface_op_feed_t)(esp_afe_sr_data_t *afe, const int16_t* in);
+    fetch 的数据采样点数目（采样点数据类型为 int16）可以通过 ``afe_handle->get_fetch_chunksize`` 获取。
 
-获取音频通道数：
+    ::
 
-使用 ``afe_handle->get_total_channel_num()`` 函数可以获取需要传入 ``afe_handle->feed()`` 函数的总数据通道数。其返回值等于AFE_CONFIG_DEFAULT()中配置的 ``pcm_config.mic_num + pcm_config.ref_num``
+        /**
+        * @brief Get the amount of each channel samples per frame that need to be passed to the function
+        *
+        * Every speech enhancement AFE_SR processes a certain number of samples at the same time. This function
+        * can be used to query that amount. Note that the returned amount is in 16-bit samples, not in bytes.
+        *
+        * @param afe The AFE_SR object to query
+        * @return The amount of samples to feed the fetch function
+        */
+        typedef int (*esp_afe_sr_iface_op_get_samp_chunksize_t)(esp_afe_sr_data_t *afe);
 
-::
+    ``afe_handle->fetch()`` 的函数声明如下：
 
-    /**
-    * @brief Get the total channel number which be config
-    * 
-    * @param afe   The AFE_SR object to query
-    * @return      The amount of total channels
-    */
-   typedef int (*esp_afe_sr_iface_op_get_total_channel_num_t)(esp_afe_sr_data_t *afe);
+    ::
 
-fetch 音频数据
-~~~~~~~~~~~~~~
+        /**
+        * @brief fetch enhanced samples of an audio stream from the AFE_SR
+        *
+        * @Warning  The output is single channel data, no matter how many channels the input is.
+        *
+        * @param afe   The AFE_SR object to query
+        * @return      The result of output, please refer to the definition of `afe_fetch_result_t`. (The frame size of output audio can be queried by the `get_fetch_chunksize`.)
+        */
+        typedef afe_fetch_result_t* (*esp_afe_sr_iface_op_fetch_t)(esp_afe_sr_data_t *afe);
 
-用户调用 ``afe_handle->fetch()`` 函数可以获取处理完成的单通道音频以及相关处理信息。
+    其返回值为结构体指针，结构体定义如下：
 
-fetch 的数据采样点数目（采样点数据类型为 int16）可以通过 ``afe_handle->get_fetch_chunksize`` 获取。
+    ::
 
-::
-
-    /**
-    * @brief Get the amount of each channel samples per frame that need to be passed to the function
-    *
-    * Every speech enhancement AFE_SR processes a certain number of samples at the same time. This function
-    * can be used to query that amount. Note that the returned amount is in 16-bit samples, not in bytes.
-    *
-    * @param afe The AFE_SR object to query
-    * @return The amount of samples to feed the fetch function
-    */
-   typedef int (*esp_afe_sr_iface_op_get_samp_chunksize_t)(esp_afe_sr_data_t *afe);
-
-``afe_handle->fetch()`` 的函数声明如下：
-
-::
-
-    /**
-    * @brief fetch enhanced samples of an audio stream from the AFE_SR
-    *
-    * @Warning  The output is single channel data, no matter how many channels the input is.
-    *
-    * @param afe   The AFE_SR object to query
-    * @return      The result of output, please refer to the definition of `afe_fetch_result_t`. (The frame size of output audio can be queried by the `get_fetch_chunksize`.)
-    */
-   typedef afe_fetch_result_t* (*esp_afe_sr_iface_op_fetch_t)(esp_afe_sr_data_t *afe);
-
-其返回值为结构体指针，结构体定义如下：
-
-::
-
-    /**
-    * @brief The result of fetch function
-    */
-    typedef struct afe_fetch_result_t
-    {
-       int16_t *data;                          // the data of audio.
-       int data_size;                          // the size of data. The unit is byte.
-       int wakeup_state;                       // the value is wakenet_state_t
-       int wake_word_index;                    // if the wake word is detected. It will store the wake word index which start from 1.
-       int vad_state;                          // the value is afe_vad_state_t
-       int trigger_channel_id;                 // the channel index of output
-       int wake_word_length;                   // the length of wake word. It's unit is the number of samples.
-       int ret_value;                          // the return state of fetch function
-       void* reserved;                         // reserved for future use
-    } afe_fetch_result_t;
-
-WakeNet 使用
-~~~~~~~~~~~~~
-
-当用户在唤醒后需要进行其他操作，比如离线或在线语音识别，这时候可以暂停 WakeNet 的运行，从而减轻 CPU 的资源消耗。
-
-用户可以调用 ``afe_handle->disable_wakenet(afe_data)`` 来停止 WakeNet。当后续应用结束后又可以调用 ``afe_handle->enable_wakenet(afe_data)`` 来开启 WakeNet。
-
-另外，ESP32S3 芯片，支持唤醒词切换。(注： ESP32 芯片只支持一个唤醒词，不支持切换)。在初始化 AFE 完成后，ESP32S3 芯片可通过 ``set_wakenet()`` 函数切换唤醒词。例如, ``afe_handle->set_wakenet(afe_data, “wn9_hilexin”)`` 切换到“Hi Lexin”唤醒词。具体如何配置多个唤醒词，详见： `flash_model <../flash_model/README_CN.md>`__
+        /**
+        * @brief The result of fetch function
+        */
+        typedef struct afe_fetch_result_t
+        {
+            int16_t *data;                          // the audio data.
+            int data_size;                          // the size of data, in bytes.
+            int wakeup_state;                       // the value is wakenet_state_t
+            int wake_word_index;                    // if the wake word is detected, this stores the wake word index, which starts from 1.
+            int vad_state;                          // the value is afe_vad_state_t
+            int trigger_channel_id;                 // the channel index of output
+            int wake_word_length;                   // the length of the wake word, in samples.
+            int ret_value;                          // the return state of fetch function
+            void* reserved;                         // reserved for future use
+        } afe_fetch_result_t;
+
+    WakeNet 使用
+    ~~~~~~~~~~~~~
+
+    当用户在唤醒后需要进行其他操作，比如离线或在线语音识别，这时候可以暂停 WakeNet 的运行，从而减轻 CPU 的资源消耗。
+
+    用户可以调用 ``afe_handle->disable_wakenet(afe_data)`` 来停止 WakeNet。当后续应用结束后又可以调用 ``afe_handle->enable_wakenet(afe_data)`` 来开启 WakeNet。
+
+    另外，ESP32S3 芯片，支持唤醒词切换。(注： ESP32 芯片只支持一个唤醒词，不支持切换)。在初始化 AFE 完成后，ESP32S3 芯片可通过 ``set_wakenet()`` 函数切换唤醒词。例如, ``afe_handle->set_wakenet(afe_data, "wn9_hilexin")`` 切换到“Hi Lexin”唤醒词。具体如何配置多个唤醒词，详见： `flash_model <../flash_model/README_CN.md>`__
 
 AEC 使用
 ~~~~~~~~
diff --git a/docs/zh_CN/flash_model/README.rst b/docs/zh_CN/flash_model/README.rst
index 87e0488..a8f35be 100644
--- a/docs/zh_CN/flash_model/README.rst
+++ b/docs/zh_CN/flash_model/README.rst
@@ -153,18 +153,20 @@ ESP32S3 支持：
 
 -  自定义路径 如果用户想将模型放置于指定文件夹，可以自己修改 ``get_model_base_path()`` 函数，位于 ``ESP-SR_PATH/model/model_path.c``。 比如，指定文件夹为 SD 卡目录中的 ``espmodel``, 则可以修改该函数为：
 
-    ::
+.. only:: html
 
-        char *get_model_base_path(void)
-        {
-        #if defined CONFIG_MODEL_IN_SDCARD
-            return "sdcard/espmodel";
-        #elif defined CONFIG_MODEL_IN_SPIFFS
-            return "srmodel";
-        #else
-            return NULL;
-        #endif
-        }
+        ::
+
+            char *get_model_base_path(void)
+            {
+            #if defined CONFIG_MODEL_IN_SDCARD
+                return "sdcard/espmodel";
+            #elif defined CONFIG_MODEL_IN_SPIFFS
+                return "srmodel";
+            #else
+                return NULL;
+            #endif
+            }
 
 -  初始化 SD 卡
 
@@ -172,39 +174,41 @@ ESP32S3 支持：
 
 完成以上操作后，便可以进行工程的烧录。
 
-代码中模型初始化与使用
-^^^^^^^^^^^^^^^^^^^^^^
+.. only:: html
 
-::
+    代码中模型初始化与使用
+    ^^^^^^^^^^^^^^^^^^^^^^
 
-        //
-        // step1: initialize spiffs and return models in spiffs
-        // 
-        srmodel_list_t *models = esp_srmodel_init();
+    ::
 
-        //
-        // step2: select the specific model by keywords
-        //
-        char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, NULL); // select wakenet model
-        char *nm_name = esp_srmodel_filter(models, ESP_MN_PREFIX, NULL); // select multinet model
-        char *alexa_wn_name  = esp_srmodel_filter(models, ESP_WN_PREFIX, "alexa"); // select wakenet with "alexa" wake word.
-        char *en_mn_name  = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_ENGLISH); // select english multinet model
-        char *cn_mn_name  = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_CHINESE); // select english multinet model
+            //
+            // step1: initialize spiffs and return models in spiffs
+            // 
+            srmodel_list_t *models = esp_srmodel_init();
 
-        // It also works if you use the model name directly in your code.
-        char *my_wn_name = "wn9_hilexin"  
-        // we recommend you to check that it is loaded correctly
-        if (!esp_srmodel_exists(models, my_wn_name))
-            printf("%s can not be loaded correctly\n")
+            //
+            // step2: select the specific model by keywords
+            //
+            char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, NULL); // select wakenet model
+            char *nm_name = esp_srmodel_filter(models, ESP_MN_PREFIX, NULL); // select multinet model
+            char *alexa_wn_name  = esp_srmodel_filter(models, ESP_WN_PREFIX, "alexa"); // select wakenet with "alexa" wake word.
+            char *en_mn_name  = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_ENGLISH); // select english multinet model
+            char *cn_mn_name  = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_CHINESE); // select chinese multinet model
 
-        //
-        // step3: initialize model
-        //
-        esp_wn_iface_t *wakenet = esp_wn_handle_from_name(wn_name);
-        model_iface_data_t *wn_model_data = wakenet->create(wn_name, DET_MODE_2CH_90);
+            // It also works if you use the model name directly in your code.
+            char *my_wn_name = "wn9_hilexin";
+            // we recommend checking that it was loaded correctly
+            if (!esp_srmodel_exists(models, my_wn_name))
+                printf("%s can not be loaded correctly\n", my_wn_name);
 
-        esp_mn_iface_t *multinet = esp_mn_handle_from_name(mn_name);
-        model_iface_data_t *mn_model_data = multinet->create(mn_name, 6000);
+            //
+            // step3: initialize model
+            //
+            esp_wn_iface_t *wakenet = esp_wn_handle_from_name(wn_name);
+            model_iface_data_t *wn_model_data = wakenet->create(wn_name, DET_MODE_2CH_90);
+
+            esp_mn_iface_t *multinet = esp_mn_handle_from_name(mn_name);
+            model_iface_data_t *mn_model_data = multinet->create(mn_name, 6000);
 
 .. |select wake wake| image:: ../../_static/wn_menu1.png
 .. |multi wake wake| image:: ../../_static/wn_menu2.png
diff --git a/docs/zh_CN/index.rst b/docs/zh_CN/index.rst
index 4a7d8fe..62d4d32 100644
--- a/docs/zh_CN/index.rst
+++ b/docs/zh_CN/index.rst
@@ -24,7 +24,6 @@ ESP-SR 用户指南
     唤醒词模型 <wake_word_engine/README>
     定制化唤醒词 <wake_word_engine/ESP_Wake_Words_Customization>
     语音指令 <speech_command_recognition/README>
-    声学算法介绍 <acoustic_algorithm/README>
     模型加载方式 <flash_model/README>
     麦克风设计指南 <audio_front_end/Espressif_Microphone_Design_Guidelines>
     测试报告 <test_report/README>
diff --git a/docs/zh_CN/speech_command_recognition/README.rst b/docs/zh_CN/speech_command_recognition/README.rst
index 2806758..3e0c47e 100644
--- a/docs/zh_CN/speech_command_recognition/README.rst
+++ b/docs/zh_CN/speech_command_recognition/README.rst
@@ -85,6 +85,11 @@ MultiNet 对命令词自定义方法没有限制，用户可以通过任意方
 
    **并且我们也提供相应的工具，供用户将汉字转换为拼音，详细可见：** `英文转音素工具 <../../tool/multinet_g2p.py>`__。
 
+.. only:: latex
+
+    .. figure:: ../../_static/QR_multinet_g2p.png
+        :alt: QR code for the multinet_g2p tool
+
 离线设置命令词
 ^^^^^^^^^^^^^^^
 
diff --git a/docs/zh_CN/test_report/README.rst b/docs/zh_CN/test_report/README.rst
index 3847f8e..2781e21 100644
--- a/docs/zh_CN/test_report/README.rst
+++ b/docs/zh_CN/test_report/README.rst
@@ -138,18 +138,21 @@
 唤醒率测试
 -----------
 
-+----------------+------------+---------------+-----------+-------------+-------------+--------+--------+
-|     测试项     |  环境噪声  |   噪声指标    | 信噪比SNR |    角度     |    距离     | 唤醒率 | 识别率 |
-+================+============+===============+===========+=============+=============+========+========+
-| 本地唤醒率测试 | 安静       | - 人声：59dBA | NA        | - 人声：90° | - 人声：3米 | 99%    | 91.5%  |
-|                |            | - 噪声：NA    |           | - 噪声45°   | - 噪声：2米 |        |        |
-|                +------------+---------------+-----------+             +             +--------+--------+
-|                | 白噪声     | - 人声：59dBA | ≥4dBA     |             |             | 99%    | 78.25% |
-|                |            | - 噪声：55dBA |           |             |             |        |        |
-|                +------------+---------------+-----------+             +             +--------+--------+
-|                | 人声类噪声 | - 人声：59dBA | ≥4dBA     |             |             | 99%    | 82.77% |
-|                |            | - 噪声55dBA   |           |             |             |        |        |
-+----------------+------------+---------------+-----------+-------------+-------------+--------+--------+
++----------------+------------+-------------+-----------+-----------+-----------+--------+--------+
+| 测试项         | 环境噪声   | 噪声指标    | 信噪比SNR | 角度      | 距离      | 唤醒率 | 识别率 |
++================+============+=============+===========+===========+===========+========+========+
+| 本地唤醒率测试 | 安静       | 人声：59dBA | NA        | 人声：90° | 人声：3米 | 99%    | 91.5%  |
+|                |            |             |           |           |           |        |        |
+|                |            | 噪声：NA    |           | 噪声：45° | 噪声：2米 |        |        |
+|                +------------+-------------+-----------+           |           +--------+--------+
+|                | 白噪声     | 人声：59dBA | ≥4dBA     |           |           | 99%    | 78.25% |
+|                |            |             |           |           |           |        |        |
+|                |            | 噪声：55dBA |           |           |           |        |        |
+|                +------------+-------------+-----------+           |           +--------+--------+
+|                | 人声类噪声 | 人声：59dBA | ≥4dBA     |           |           | 99%    | 82.77% |
+|                |            |             |           |           |           |        |        |
+|                |            | 噪声：55dBA |           |           |           |        |        |
++----------------+------------+-------------+-----------+-----------+-----------+--------+--------+
 
 误唤醒测试
 -----------
@@ -168,11 +171,11 @@
 +----------------+----------+---------------+-----------+--------+--------------+
 |     测试项     | 环境噪声 |   噪声指标    | 信噪比SNR | 唤醒率 | 命令词识别率 |
 +================+==========+===============+===========+========+==============+
-| 唤醒打断率测试 | 音乐     | - 人声59dBA   | ≥-10dBA   | 100%   | 96%          |
-|                |          | - 噪声69dBA   |           |        |              |
+| 唤醒打断率测试 | 音乐     | 人声：59dBA   | ≥ -10dBA  | 100%   | 96%          |
+|                |          | 噪声：69dBA   |           |        |              |
 |                +----------+---------------+-----------+--------+--------------+
-|                | TTS      | - 人声：59dBA | ≥-10dBA   | 100%   | 96%          |
-|                |          | - 噪声：69dBA |           |        |              |
+|                | TTS      | 人声：59dBA   | ≥ -10dBA  | 100%   | 96%          |
+|                |          | 噪声：69dBA   |           |        |              |
 +----------------+----------+---------------+-----------+--------+--------------+
 
 响应时间测试
@@ -181,8 +184,8 @@
 +--------------+----------+---------------+------------+----------+
 |    测试项    | 环境噪声 |   噪声指标    | 信噪比 SNR | 响应时间 |
 +==============+==========+===============+============+==========+
-| 响应时间测试 | 安静     | - 人声：59dBA | NA         | <500 ms  |
-|              |          | - 噪声：NA    |            |          |
+| 响应时间测试 | 安静     | 人声：59dBA   | NA         | <500 ms  |
+|              |          | 噪声：NA      |            |          |
 +--------------+----------+---------------+------------+----------+
 
 .. figure:: ../../_static/test_response_time.png
diff --git a/docs/zh_CN/wake_word_engine/README.rst b/docs/zh_CN/wake_word_engine/README.rst
index 891758a..499fc05 100644
--- a/docs/zh_CN/wake_word_engine/README.rst
+++ b/docs/zh_CN/wake_word_engine/README.rst
@@ -24,6 +24,11 @@ WakeNet的流程图如下：
 -  Speech Features：
     我们使用 `MFCC <https://en.wikipedia.org/wiki/Mel-frequency_cepstrum>`__ 方法提取语音频谱特征。输入的音频文件采样率为16KHz，单声道，编码方式为signed 16-bit。每帧窗宽和步长均为30ms。
 
+.. only:: latex
+
+    .. figure:: ../../_static/QR_MFCC.png
+        :alt: QR code for the MFCC reference
+
 -  Neural Network：
     神经网络结构已经更新到第9版，其中：
 
@@ -31,6 +36,11 @@ WakeNet的流程图如下：
     -  wakeNet5应用于ESP32芯片。
     -  wakeNet8和wakeNet9应用于ESP32S3芯片，模型基于　`Dilated Convolution <https://arxiv.org/pdf/1609.03499.pdf>`__ 结构。
 
+    .. only:: latex
+
+        .. figure:: ../../_static/QR_Dilated_Convolution.png
+            :alt: QR code for the Dilated Convolution reference
+
     注意，WakeNet5,WakeNet5X2 和 WakeNet5X3 的网络结构一致，但是 WakeNet5X2 和 WakeNet5X3 的参数比 WakeNet5 要多。请参考 `性能测试 <#性能测试>`__ 来获取更多细节。
 
 -  Keyword Trigger Method：
@@ -71,9 +81,7 @@ WakeNet使用
 
 -  WakeNet 运行
 
-    WakeNet 目前包含在语音前端算法
-    `AFE <../audio_front_end/README_CN.md>`__
-    中，默认为运行状态，并将识别结果通过 AFE fetch 接口返回。
+    WakeNet 目前包含在语音前端算法 `AFE <../audio_front_end/README_CN.md>`__ 中，默认为运行状态，并将识别结果通过 AFE fetch 接口返回。
 
     如果用户不需要初始化 WakeNet，请在 AFE 配置时选择：