mirror of
https://github.com/modelscope/FunASR
synced 2025-09-15 14:48:36 +08:00
Note on French spelling
Due to a 1990 orthographic reform, there are currently two conventions for written French numbers:
- Reformed: All composite words are joined by a hyphen, e.g.
  1122 -> mille-cent-vingt-deux
- Traditional: Hyphenation only occurs (with exceptions) for numbers from 17 to 99 (inclusive), e.g.
  1122 -> mille cent vingt-deux
Because the training data available for upstream ASR varies in which convention it uses, NeMo's French ITN accommodates either style for normalization, e.g.
python inverse_normalize.py "mille-cent-vingt-deux" --language="fr" --> 1122
python inverse_normalize.py "mille cent vingt-deux" --language="fr" --> 1122
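One way to picture why both spellings normalize to the same value is to treat hyphens and spaces as interchangeable separators before interpreting the words. The sketch below is a plain-Python toy under that assumption. It is not NeMo's implementation, and the names `SMALL`, `tokenize` and `parse_cardinal` are hypothetical, with a deliberately incomplete vocabulary.

```python
import re

# Toy vocabulary (hypothetical, incomplete; not taken from NeMo's data files).
SMALL = {
    "un": 1, "deux": 2, "trois": 3, "quatre": 4, "cinq": 5, "six": 6,
    "sept": 7, "huit": 8, "neuf": 9, "dix": 10, "onze": 11, "douze": 12,
    "treize": 13, "quatorze": 14, "quinze": 15, "seize": 16,
    "vingt": 20, "trente": 30, "quarante": 40, "cinquante": 50,
    "soixante": 60,
}


def tokenize(text):
    """Split on hyphens and whitespace so both spellings give the same tokens."""
    return [t for t in re.split(r"[-\s]+", text.lower().strip()) if t]


def parse_cardinal(text):
    """Convert a French cardinal below one million to an int (toy grammar)."""
    total, current = 0, 0
    for tok in tokenize(text):
        if tok == "et":
            continue  # "vingt et un": the conjunction carries no value
        if tok == "mille":
            total += max(current, 1) * 1000
            current = 0
        elif tok in ("cent", "cents"):
            current = max(current, 1) * 100
        elif tok in ("vingt", "vingts") and current % 100 == 4:
            current += 76  # "quatre-vingt(s)": turn the pending 4 into 80
        elif tok in SMALL:
            current += SMALL[tok]
        else:
            raise ValueError(f"unknown token: {tok!r}")
    return total + current


print(parse_cardinal("mille-cent-vingt-deux"))  # 1122 (reformed spelling)
print(parse_cardinal("mille cent vingt-deux"))  # 1122 (traditional spelling)
```

Because the separator is discarded during tokenization, the parser never needs to know which convention the input follows.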
Accepting both styles introduces an ambiguity in currency conversions, namely for the minor denomination of the dollar (cents), e.g.
300 -> "trois-cents" # Reformed spelling
300 -> "trois cents" # Traditional spelling
3 ¢ -> "trois cents" # Valid for both
Cardinals take priority in such cases.
python inverse_normalize.py "trois cents" --language="fr" --> 300
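The priority rule can be illustrated with a small sketch in which each possible reading of a phrase is produced by its own function, and the order of those functions decides which reading wins. This is a toy under stated assumptions, not NeMo's grammar: `as_cardinal`, `as_cents`, `all_readings` and `resolve` are hypothetical names, the vocabulary covers only "deux" to "neuf", and the `"3 ¢"` output format simply mirrors the example above.

```python
import re

# Hypothetical toy vocabulary: multipliers that can precede "cent(s)".
UNITS = {
    "deux": 2, "trois": 3, "quatre": 4, "cinq": 5,
    "six": 6, "sept": 7, "huit": 8, "neuf": 9,
}


def as_cardinal(phrase):
    """Read 'N cents' or 'N-cents' as the cardinal N * 100."""
    tokens = [t for t in re.split(r"[-\s]+", phrase.lower().strip()) if t]
    if len(tokens) == 2 and tokens[0] in UNITS and tokens[1] in ("cent", "cents"):
        return str(UNITS[tokens[0]] * 100)
    return None


def as_cents(phrase):
    """Read 'N cents' (space-separated only) as N cents of a dollar."""
    tokens = phrase.lower().split()
    if len(tokens) == 2 and tokens[0] in UNITS and tokens[1] in ("cent", "cents"):
        return f"{UNITS[tokens[0]]} ¢"
    return None


# Order encodes priority: the cardinal reading is tried first.
READERS = [as_cardinal, as_cents]


def all_readings(phrase):
    """Every reading the toy grammar allows, highest priority first."""
    return [r for r in (reader(phrase) for reader in READERS) if r is not None]


def resolve(phrase):
    """Return the highest-priority reading, or the phrase unchanged."""
    readings = all_readings(phrase)
    return readings[0] if readings else phrase


print(all_readings("trois cents"))  # ['300', '3 ¢'] -> ambiguous
print(all_readings("trois-cents"))  # ['300'] -> reformed spelling, no ambiguity
print(resolve("trois cents"))       # 300
```

Only the space-separated form is ambiguous, since the hyphenated reformed spelling can only be a cardinal.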