Benchmark
==========

:link_to_translation:`zh_CN:[中文]`

AFE
---

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

.. only:: esp32

    +-----------------+-----------------+-----------------+-----------------+
    | Algorithm Type  | RAM             | Average CPU     | Frame Length    |
    |                 |                 | load (computed  |                 |
    |                 |                 | with 2 cores)   |                 |
    +=================+=================+=================+=================+
    | AEC (HIGH_PERF) | 114 KB          | 11%             | 32 ms           |
    +-----------------+-----------------+-----------------+-----------------+
    | NS              | 27 KB           | 5%              | 10 ms           |
    +-----------------+-----------------+-----------------+-----------------+
    | AFE Layer       | 73 KB           |                 |                 |
    +-----------------+-----------------+-----------------+-----------------+

.. only:: esp32s3

    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | Input Format | Type | Mode      | Internal RAM  | PSRAM      | Feed Task CPU  | Fetch Task CPU  |
    +==============+======+===========+===============+============+================+=================+
    | MR           | SR   | LOW_COST  | 72348         | 732932     | 8.4%           | 14.9%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | SR   | HIGH_PERF | 78016         | 734980     | 9.4%           | 14.9%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | VC   | LOW_COST  | 50316         | 821564     | 60.0%          | 8.1%            |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | VC   | HIGH_PERF | 93668         | 824144     | 64.0%          | 8.2%            |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MMR          | SR   | LOW_COST  | 76684         | 1175148    | 36.6%          | 30.2%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MMR          | SR   | HIGH_PERF | 99064         | 1174960    | 38.8%          | 30.0%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+

    .. note::

        Input Format:

        - MR: one microphone channel and one playback channel
        - MMR: two microphone channels and one playback channel

.. only:: esp32p4

    +--------------+------+-----------+---------------+------------+-----------------+-----------------+
    | Input Format | Type | Mode      | Internal RAM  | PSRAM      | Feed Task CPU   | Fetch Task CPU  |
    +==============+======+===========+===============+============+=================+=================+
    | MR           | SR   | LOW_COST  | 75404         | 751292     | 10.6%           | 11.3%           |
    +--------------+------+-----------+---------------+------------+-----------------+-----------------+
    | MR           | SR   | HIGH_PERF | 75128         | 751292     | 10.6%           | 11.3%           |
    +--------------+------+-----------+---------------+------------+-----------------+-----------------+
    | MR           | VC   | LOW_COST  | 76192         | 841300     | 40.3%           | 5.7%            |
    +--------------+------+-----------+---------------+------------+-----------------+-----------------+
    | MR           | VC   | HIGH_PERF | 119536        | 843880     | 42.6%           | 5.7%            |
    +--------------+------+-----------+---------------+------------+-----------------+-----------------+
    | MMR          | SR   | LOW_COST  | 79940         | 1202692    | 28.4%           | 24.9%           |
    +--------------+------+-----------+---------------+------------+-----------------+-----------------+
    | MMR          | SR   | HIGH_PERF | 79940         | 1202692    | 28.4%           | 24.9%           |
    +--------------+------+-----------+---------------+------------+-----------------+-----------------+

    .. note::

        Input Format:

        - MR: one microphone channel and one playback channel
        - MMR: two microphone channels and one playback channel

WakeNet
-------

.. _resource-occupancyesp32-1:

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

.. only:: esp32

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Parameter   | RAM         | Average     | Frame       |
    |             | Num         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | Quantised   | 41 K        | 15 KB       | 5.5 ms      | 30 ms       |
    | WakeNet5    |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | Quantised   | 165 K       | 20 KB       | 10.5 ms     | 30 ms       |
    | WakeNet5X2  |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | Quantised   | 371 K       | 24 KB       | 18 ms       | 30 ms       |
    | WakeNet5X3  |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+

.. _resource-occupancyesp32s3-1:

.. only:: esp32s3

    +----------------+-------+---------+----------------+--------------+
    | Model Type     | RAM   | PSRAM   | Average        | Frame Length |
    |                |       |         | Running Time   |              |
    |                |       |         | per Frame      |              |
    +================+=======+=========+================+==============+
    | Quantised      | 50 KB | 1640 KB | 10.0 ms        | 32 ms        |
    | WakeNet8 @ 2   |       |         |                |              |
    | channels       |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
    | Quantised      | 16 KB | 324 KB  | 3.0 ms         | 32 ms        |
    | WakeNet9 @ 2   |       |         |                |              |
    | channels       |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
    | Quantised      | 20 KB | 347 KB  | 4.3 ms         | 32 ms        |
    | WakeNet9 @ 3   |       |         |                |              |
    | channels       |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+

.. only:: esp32p4

    +----------------+-------+---------+----------------+--------------+
    | Model Type     | RAM   | PSRAM   | Average        | Frame Length |
    |                |       |         | Running Time   |              |
    |                |       |         | per Frame      |              |
    +================+=======+=========+================+==============+
    | Quantised      | 16 KB | 324 KB  | 2.6 ms         | 32 ms        |
    | WakeNet9 @ 2   |       |         |                |              |
    | channels       |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
    | Quantised      | 20 KB | 347 KB  | 3.1 ms         | 32 ms        |
    | WakeNet9 @ 3   |       |         |                |              |
    | channels       |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
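
The running time per frame and the frame length in the tables above can be combined into a rough CPU load figure. The sketch below is illustrative only: the helper name ``cpu_load_percent`` is hypothetical and not part of ESP-SR, and it assumes that the model processes exactly one frame per frame period on a single core.

.. code-block:: python

    def cpu_load_percent(run_time_ms: float, frame_length_ms: float) -> float:
        """Approximate share of one core spent on the model, in percent.

        Assumes one frame is processed per frame period (hypothetical helper).
        """
        if frame_length_ms <= 0:
            raise ValueError("frame_length_ms must be positive")
        return 100.0 * run_time_ms / frame_length_ms


    # WakeNet9 @ 2 channels on ESP32-S3: 3.0 ms of work per 32 ms frame.
    print(f"{cpu_load_percent(3.0, 32.0):.1f}%")  # prints "9.4%"

A value above 100% would mean the model cannot keep up with real-time audio on one core under these assumptions.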

Performance Test
~~~~~~~~~~~~~~~~

+-------------+-------------+-------------+-------------+--------------+
| Distance    | Quiet       | Stationary  | Speech      | AEC          |
|             |             | Noise (SNR  | Noise (SNR  | Interruption |
|             |             | = 4 dB)     | = 4 dB)     | (-10 dB)     |
+=============+=============+=============+=============+==============+
| 1 m         | 98%         | 96%         | 94%         | 96%          |
+-------------+-------------+-------------+-------------+--------------+
| 3 m         | 98%         | 96%         | 94%         | 94%          |
+-------------+-------------+-------------+-------------+--------------+

False triggering rate: one false trigger in 12 hours

.. note::

    This test used the ESP32-S3-Korvo V4.0 development board and the WakeNet9 (Alexa) model.

MultiNet
--------

.. _resource-occupancyesp32-2:

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

.. only:: esp32

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Internal    | PSRAM       | Average     | Frame       |
    |             | RAM         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | MultiNet 2  | 13.3 KB     | 9 KB        | 38 ms       | 30 ms       |
    +-------------+-------------+-------------+-------------+-------------+

.. _resource-occupancyesp32s3-2:

.. only:: esp32s3

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Internal    | PSRAM       | Average     | Frame       |
    |             | RAM         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | MultiNet 4  | 16.8 KB     | 1866 KB     | 18 ms       | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 4  | 10.5 KB     | 1009 KB     | 11 ms       | 32 ms       |
    | Q8          |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 5  | 16 KB       | 2310 KB     | 12 ms       | 32 ms       |
    | Q8          |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 6  | 32 KB       | 4100 KB     | 12 ms       | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 7  | 18 KB       | 2920 KB     | 11 ms       | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+

.. only:: esp32p4

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Internal    | PSRAM       | Average     | Frame       |
    |             | RAM         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | MultiNet 7  | 18 KB       | 2920 KB     | 8 ms        | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+

Word Error Rate Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+--------------+-------------+-------------+
| Model Type   | LibriSpeech | LibriSpeech |
|              | test-clean  | test-other  |
+==============+=============+=============+
| MultiNet5-en | 16.5%       | 41.4%       |
+--------------+-------------+-------------+
| MultiNet6-en | 9.0%        | 21.3%       |
+--------------+-------------+-------------+
| MultiNet7-en | 8.5%        | 21.3%       |
+--------------+-------------+-------------+
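
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognised text and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation script used for the table above; the function name ``word_error_rate`` is hypothetical, and any text normalisation applied in the actual benchmark is unknown and therefore omitted.

.. code-block:: python

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Word-level Levenshtein distance divided by reference length."""
        ref = reference.split()
        hyp = hypothesis.split()
        if not ref:
            raise ValueError("reference must contain at least one word")

        # prev[j] holds the edit distance between the reference words seen
        # so far (excluding the current one) and the first j hypothesis words.
        prev = list(range(len(hyp) + 1))
        for i, ref_word in enumerate(ref, start=1):
            curr = [i]
            for j, hyp_word in enumerate(hyp, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                ))
            prev = curr
        return prev[-1] / len(ref)


    # One deletion ("on") and one substitution ("light" -> "lights")
    # against a four-word reference.
    print(word_error_rate("turn on the light", "turn the lights"))  # prints 0.5
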

Speech Commands Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+-----------+-----------+----------+------------+-------------+
| Model     | Distance  | Quiet    | Stationary | Speech      |
| Type      |           |          | Noise      | Noise       |
|           |           |          | (SNR =     | (SNR =      |
|           |           |          | 5~10 dB)   | 5~10 dB)    |
+===========+===========+==========+============+=============+
| MultiNet  | 3 m       | 95.4%    | 85.9%      | 82.7%       |
| 5_en      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+
| MultiNet  | 3 m       | 96.8%    | 87.9%      | 85.5%       |
| 6_en      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+
| MultiNet  | 3 m       | 97.2%    | 92.3%      | 90.6%       |
| 7_en      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+

TTS
---

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

- Flash image size: 2.2 MB
- Runtime RAM: 20 KB

Performance Test
~~~~~~~~~~~~~~~~

CPU load test (ESP32 @ 240 MHz):

+------------------------------+------+------+------+------+------+------+
| Speech Rate                  | 0    | 1    | 2    | 3    | 4    | 5    |
+==============================+======+======+======+======+======+======+
| Times faster than real time  | 4.5  | 3.2  | 2.9  | 2.5  | 2.2  | 1.8  |
+------------------------------+------+------+------+------+------+------+
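
As a rough guide to reading this table, the sketch below estimates how long synthesis would take for a clip of a given duration. It assumes that "times faster than real time" means audio duration divided by synthesis time; the names ``SPEEDUP_BY_RATE`` and ``synthesis_time_s`` are hypothetical and not part of ESP-SR.

.. code-block:: python

    # Speed-up factors copied from the table above (ESP32 @ 240 MHz).
    SPEEDUP_BY_RATE = {0: 4.5, 1: 3.2, 2: 2.9, 3: 2.5, 4: 2.2, 5: 1.8}


    def synthesis_time_s(audio_seconds: float, speech_rate: int) -> float:
        """Estimated time to synthesise `audio_seconds` of speech.

        Assumes speed-up = audio duration / synthesis time (an assumption).
        """
        return audio_seconds / SPEEDUP_BY_RATE[speech_rate]


    # Under this assumption, 9 s of audio at speech rate 0 takes about 2 s.
    print(synthesis_time_s(9.0, 0))  # prints 2.0
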