Benchmark
==========

:link_to_translation:`zh_CN:[中文]`

AFE
---

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

.. only:: esp32

    +-----------------+-----------------+-----------------+-----------------+
    | Algorithm Type  | RAM             | Average CPU     | Frame Length    |
    |                 |                 | loading         |                 |
    |                 |                 | (computed with  |                 |
    |                 |                 | 2 cores)        |                 |
    +=================+=================+=================+=================+
    | AEC(HIGH_PERF)  | 114 KB          | 11%             | 32 ms           |
    +-----------------+-----------------+-----------------+-----------------+
    | NS              | 27 KB           | 5%              | 10 ms           |
    +-----------------+-----------------+-----------------+-----------------+
    | AFE Layer       | 73 KB           |                 |                 |
    +-----------------+-----------------+-----------------+-----------------+

.. only:: esp32s3

    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | Input Format | Type | Mode      | Internal RAM  | PSRAM      | Feed Task CPU  | Fetch Task CPU  |
    +==============+======+===========+===============+============+================+=================+
    | MR           | SR   | LOW_COST  | 72348         | 732932     | 8.4%           | 14.9%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | SR   | HIGH_PERF | 78016         | 734980     | 9.4%           | 14.9%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | VC   | LOW_COST  | 50316         | 821564     | 60.0%          | 8.1%            |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | VC   | HIGH_PERF | 93668         | 824144     | 64.0%          | 8.2%            |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MMR          | SR   | LOW_COST  | 76684         | 1175148    | 36.6%          | 30.2%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MMR          | SR   | HIGH_PERF | 99064         | 1174960    | 38.8%          | 30.0%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+

    .. note::

        Input Format:

        - MR: one microphone channel and one playback channel
        - MMR: two microphone channels and one playback channel

.. only:: esp32p4

    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | Input Format | Type | Mode      | Internal RAM  | PSRAM      | Feed Task CPU  | Fetch Task CPU  |
    +==============+======+===========+===============+============+================+=================+
    | MR           | SR   | LOW_COST  | 75404         | 751292     | 10.6%          | 11.3%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | SR   | HIGH_PERF | 75128         | 751292     | 10.6%          | 11.3%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | VC   | LOW_COST  | 76192         | 841300     | 40.3%          | 5.7%            |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MR           | VC   | HIGH_PERF | 119536        | 843880     | 42.6%          | 5.7%            |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MMR          | SR   | LOW_COST  | 79940         | 1202692    | 28.4%          | 24.9%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+
    | MMR          | SR   | HIGH_PERF | 79940         | 1202692    | 28.4%          | 24.9%           |
    +--------------+------+-----------+---------------+------------+----------------+-----------------+

    .. note::

        Input Format:

        - MR: one microphone channel and one playback channel
        - MMR: two microphone channels and one playback channel

WakeNet
-------

.. _resource-occupancyesp32-1:

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

.. only:: esp32

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Parameter   | RAM         | Average     | Frame       |
    |             | Num         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | Quantised   | 41 K        | 15 KB       | 5.5 ms      | 30 ms       |
    | WakeNet5    |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | Quantised   | 165 K       | 20 KB       | 10.5 ms     | 30 ms       |
    | WakeNet5X2  |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | Quantised   | 371 K       | 24 KB       | 18 ms       | 30 ms       |
    | WakeNet5X3  |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+

.. _resource-occupancyesp32s3-1:

.. only:: esp32s3

    +----------------+-------+---------+----------------+--------------+
    | Model Type     | RAM   | PSRAM   | Average        | Frame Length |
    |                |       |         | Running Time   |              |
    |                |       |         | per Frame      |              |
    +================+=======+=========+================+==============+
    | Quantised      | 50 KB | 1640 KB | 10.0 ms        | 32 ms        |
    | WakeNet8 @ 2   |       |         |                |              |
    | channel        |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
    | Quantised      | 16 KB | 324 KB  | 3.0 ms         | 32 ms        |
    | WakeNet9 @ 2   |       |         |                |              |
    | channel        |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
    | Quantised      | 20 KB | 347 KB  | 4.3 ms         | 32 ms        |
    | WakeNet9 @ 3   |       |         |                |              |
    | channel        |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+

.. only:: esp32p4

    +----------------+-------+---------+----------------+--------------+
    | Model Type     | RAM   | PSRAM   | Average        | Frame Length |
    |                |       |         | Running Time   |              |
    |                |       |         | per Frame      |              |
    +================+=======+=========+================+==============+
    | Quantised      | 16 KB | 324 KB  | 2.6 ms         | 32 ms        |
    | WakeNet9 @ 2   |       |         |                |              |
    | channel        |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+
    | Quantised      | 20 KB | 347 KB  | 3.1 ms         | 32 ms        |
    | WakeNet9 @ 3   |       |         |                |              |
    | channel        |       |         |                |              |
    +----------------+-------+---------+----------------+--------------+

Performance Test
~~~~~~~~~~~~~~~~

+-------------+-------------+-------------+-------------+--------------+
| Distance    | Quiet       | Stationary  | Speech      | AEC          |
|             |             | Noise (SNR  | Noise (SNR  | Interruption |
|             |             | = 4 dB)     | = 4 dB)     | (-10 dB)     |
+=============+=============+=============+=============+==============+
| 1 m         | 98%         | 96%         | 94%         | 96%          |
+-------------+-------------+-------------+-------------+--------------+
| 3 m         | 98%         | 96%         | 94%         | 94%          |
+-------------+-------------+-------------+-------------+--------------+

False triggering rate: once in 12 hours

.. note::

    This test used the ESP32-S3-Korvo V4.0 development board and the WakeNet9 (Alexa) model.

MultiNet
--------

.. _resource-occupancyesp32-2:

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

.. only:: esp32

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Internal    | PSRAM       | Average     | Frame       |
    |             | RAM         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | MultiNet 2  | 13.3 KB     | 9 KB        | 38 ms       | 30 ms       |
    +-------------+-------------+-------------+-------------+-------------+

.. _resource-occupancyesp32s3-2:

.. only:: esp32s3

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Internal    | PSRAM       | Average     | Frame       |
    |             | RAM         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | MultiNet 4  | 16.8 KB     | 1866 KB     | 18 ms       | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 4  | 10.5 KB     | 1009 KB     | 11 ms       | 32 ms       |
    | Q8          |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 5  | 16 KB       | 2310 KB     | 12 ms       | 32 ms       |
    | Q8          |             |             |             |             |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 6  | 32 KB       | 4100 KB     | 12 ms       | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+
    | MultiNet 7  | 18 KB       | 2920 KB     | 11 ms       | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+

.. only:: esp32p4

    +-------------+-------------+-------------+-------------+-------------+
    | Model Type  | Internal    | PSRAM       | Average     | Frame       |
    |             | RAM         |             | Running     | Length      |
    |             |             |             | Time per    |             |
    |             |             |             | Frame       |             |
    +=============+=============+=============+=============+=============+
    | MultiNet 7  | 18 KB       | 2920 KB     | 8 ms        | 32 ms       |
    +-------------+-------------+-------------+-------------+-------------+

Word Error Rate Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+--------------+-------------+-------------+
| Model Type   | LibriSpeech | LibriSpeech |
|              | test-clean  | test-other  |
+==============+=============+=============+
| MultiNet5-en | 16.5%       | 41.4%       |
+--------------+-------------+-------------+
| MultiNet6-en | 9.0%        | 21.3%       |
+--------------+-------------+-------------+
| MultiNet7-en | 8.5%        | 21.3%       |
+--------------+-------------+-------------+

Speech Commands Performance Test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+-----------+-----------+----------+------------+-------------+
| Model     | Distance  | Quiet    | Stationary | Speech      |
| Type      |           |          | Noise      | Noise       |
|           |           |          | (SNR =     | (SNR =      |
|           |           |          | 5~10 dB)   | 5~10 dB)    |
+===========+===========+==========+============+=============+
| MultiNet  | 3 m       | 95.4%    | 85.9%      | 82.7%       |
| 5_en      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+
| MultiNet  | 3 m       | 96.8%    | 87.9%      | 85.5%       |
| 6_en      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+
| MultiNet  | 3 m       | 97.2%    | 92.3%      | 90.6%       |
| 7_en      |           |          |            |             |
+-----------+-----------+----------+------------+-------------+

TTS
---

Resource Consumption
~~~~~~~~~~~~~~~~~~~~

Flash image size: 2.2 MB

Runtime RAM: 20 KB

Performance Test
~~~~~~~~~~~~~~~~

CPU loading test (ESP32 @ 240 MHz):

+------------------------------+------+------+------+------+------+------+
| Speech Rate                  | 0    | 1    | 2    | 3    | 4    | 5    |
+==============================+======+======+======+======+======+======+
| Times faster than real time  | 4.5  | 3.2  | 2.9  | 2.5  | 2.2  | 1.8  |
+------------------------------+------+------+------+------+------+------+
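
The speed-up factors in the TTS table, and the per-frame running times in the WakeNet and MultiNet tables, can be turned into a rough CPU budget. The sketch below is illustrative only and is not part of ESP-SR. It assumes that "N times faster than real time" means one second of audio costs 1/N seconds of CPU time on a single core, and that a model's share of a core is its running time per frame divided by the frame length. The names ``core_share`` and ``frame_share`` are hypothetical helpers.

```python
# Speed-up factors copied from the TTS table (ESP32 @ 240 MHz), keyed by speech rate.
TTS_SPEEDUP = {0: 4.5, 1: 3.2, 2: 2.9, 3: 2.5, 4: 2.2, 5: 1.8}


def core_share(speedup: float) -> float:
    """Approximate fraction of one core needed to synthesise in real time.

    Assumption: a speed-up of N means 1 s of audio takes 1/N s of CPU time.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def frame_share(run_ms: float, frame_ms: float) -> float:
    """Approximate fraction of one core used by a frame-based model.

    Assumption: the model runs once per frame and takes run_ms each time.
    """
    if frame_ms <= 0:
        raise ValueError("frame length must be positive")
    return run_ms / frame_ms


if __name__ == "__main__":
    # Estimated core share for each TTS speech rate.
    for rate, speedup in TTS_SPEEDUP.items():
        print(f"speech rate {rate}: about {core_share(speedup):.0%} of one core")

    # Example: WakeNet9 @ 2 channel on ESP32-P4 takes 2.6 ms per 32 ms frame.
    print(f"WakeNet9 (P4): about {frame_share(2.6, 32):.1%} of one core")
```

These figures are estimates derived from the published averages. They ignore task scheduling, memory stalls, and other workloads running on the same core.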