Update wav_utils.py

The dictionary contains no uppercase letters, so any uppercase letter in the annotated text is mapped to "unk" during fine-tuning. Convert the annotated text to lowercase when reading it.
2025-09-15 14:48:36 +08:00 · 2023-03-03 10:33:51 +08:00 · 1a39b6f981
commit 1a39b6f981
parent 5d4b0c3994
1 changed file with 1 addition and 1 deletion
--- a/funasr/utils/wav_utils.py
+++ b/funasr/utils/wav_utils.py
@@ -309,7 +309,7 @@ def filter_wav_text(data_dir, dataset):
        if len(parts) < 2:
            continue
        sample_name = parts[0]
-        text_dict[sample_name] = " ".join(parts[1:])
+        text_dict[sample_name] = " ".join(parts[1:]).lower()
    filter_count = 0
    with open(wav_file, "w") as f_wav, open(text_file, "w") as f_text:
        for sample_name, wav_path in wav_dict.items():
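
A minimal sketch of why the change matters, assuming a lowercase-only dictionary with an "unk" fallback. `VOCAB`, `parse_text_line`, and `encode` are hypothetical names for illustration and are not part of `wav_utils.py` or the FunASR API; only the split, join, and `.lower()` steps mirror the diff above.

```python
# Hypothetical toy dictionary: lowercase entries only, plus an "unk" fallback.
VOCAB = {"unk": 0, "hello": 1, "world": 2}


def parse_text_line(line):
    """Split 'sample_name word word ...' and lowercase the transcript."""
    parts = line.strip().split()
    if len(parts) < 2:
        return None  # skip lines without a transcript, as the diff's loop does
    return parts[0], " ".join(parts[1:]).lower()


def encode(text, vocab=VOCAB):
    """Map each whitespace-separated token to its id, falling back to "unk"."""
    return [vocab.get(token, vocab["unk"]) for token in text.split()]


name, text = parse_text_line("utt001 Hello World\n")
print(name, text, encode(text))  # utt001 hello world [1, 2]
print(encode("Hello World"))     # [0, 0]: every token falls back to "unk"
```

Without the `.lower()` call, both tokens in the second lookup miss the dictionary, which is the "unk" result described in the commit message.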