diff --git a/.gitignore b/.gitignore index 33b8c3979..58bee36cf 100644 --- a/.gitignore +++ b/.gitignore @@ -16,4 +16,6 @@ MaaS-lib .egg* dist build -funasr.egg-info \ No newline at end of file +funasr.egg-info +docs/_build +modelscope \ No newline at end of file diff --git a/README.md b/README.md index 414eb9b89..e9c6ef9bb 100644 --- a/README.md +++ b/README.md @@ -13,22 +13,22 @@ | [**Highlights**](#highlights) | [**Installation**](#installation) | [**Docs**](https://alibaba-damo-academy.github.io/FunASR/en/index.html) -| [**Tutorial**](https://github.com/alibaba-damo-academy/FunASR/wiki#funasr%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C) +| [**Tutorial_CN**](https://github.com/alibaba-damo-academy/FunASR/wiki#funasr%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C) | [**Papers**](https://github.com/alibaba-damo-academy/FunASR#citations) | [**Runtime**](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime) -| [**Model Zoo**](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/modelscope_models.md) +| [**Model Zoo**](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md) | [**Contact**](#contact) | [**M2MET2.0 Challenge**](https://github.com/alibaba-damo-academy/FunASR#multi-channel-multi-party-meeting-transcription-20-m2met20-challenge) ## What's new: -### Multi-Channel Multi-Party Meeting Transcription 2.0 (M2MET2.0) Challenge -We are pleased to announce that the M2MeT2.0 challenge will be held in the near future. The baseline system is conducted on FunASR and is provided as a receipe of AliMeeting corpus. For more details you can see the guidence of M2MET2.0 ([CN](https://alibaba-damo-academy.github.io/FunASR/m2met2_cn/index.html)/[EN](https://alibaba-damo-academy.github.io/FunASR/m2met2/index.html)). +### Multi-Channel Multi-Party Meeting Transcription 2.0 (M2MeT2.0) Challenge +We are pleased to announce that the M2MeT2.0 challenge has been accepted as an ASRU 2023 challenge special session. 
Registration is now open. The baseline system is built on FunASR and is provided as a recipe on the AliMeeting corpus. For more details, see the M2MeT2.0 guidelines ([CN](https://alibaba-damo-academy.github.io/FunASR/m2met2_cn/index.html)/[EN](https://alibaba-damo-academy.github.io/FunASR/m2met2/index.html)). ### Release notes For the release notes, please ref to [news](https://github.com/alibaba-damo-academy/FunASR/releases) ## Highlights - FunASR supports speech recognition(ASR), Multi-talker ASR, Voice Activity Detection(VAD), Punctuation Restoration, Language Models, Speaker Verification and Speaker diarization. -- We have released large number of academic and industrial pretrained models on [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition), ref to [Model Zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html) +- We have released a large number of academic and industrial pretrained models on [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition); refer to the [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md) - The pretrained model [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) obtains the best performance on many tasks in [SpeechIO leaderboard](https://github.com/SpeechColab/Leaderboard) - FunASR supplies a easy-to-use pipeline to finetune pretrained models from [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition) - Compared to [Espnet](https://github.com/espnet/espnet) framework, the training speed of large-scale datasets in FunASR is much faster owning to the optimized dataloader. 
@@ -60,12 +60,8 @@ pip install -U modelscope # pip install -U modelscope -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html -i https://mirror.sjtu.edu.cn/pypi/web/simple ``` -For more details, please ref to [installation](https://alibaba-damo-academy.github.io/FunASR/en/installation.html) +For more details, please refer to [installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html) -[//]: # () -[//]: # (## Usage) - -[//]: # (For users who are new to FunASR and ModelScope, please refer to FunASR Docs([CN](https://alibaba-damo-academy.github.io/FunASR/cn/index.html) / [EN](https://alibaba-damo-academy.github.io/FunASR/en/index.html))) ## Contact diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 000000000..4e16b046f --- /dev/null +++ b/docs/README.md @@ -0,0 +1,19 @@ +# FunASR document generation + +## Generate HTML +For convenience, we provide users with the ability to generate local HTML manually. + +First, you should install the following packages, which are required for building HTML: +```sh +conda activate funasr +pip install requests sphinx nbsphinx sphinx_markdown_tables sphinx_rtd_theme recommonmark +``` + +Then you can generate the HTML: + +```sh +cd docs +make html +``` + +The generated files are all contained in the "FunASR/docs/_build" directory. You can access the FunASR documentation by simply opening the "html/index.html" file in your browser from this directory. \ No newline at end of file diff --git a/docs/academic_recipe/lm_recipe.md b/docs/academic_recipe/lm_recipe.md index f82a6fee4..730e27c42 100644 --- a/docs/academic_recipe/lm_recipe.md +++ b/docs/academic_recipe/lm_recipe.md @@ -1,129 +1,3 @@ # Speech Recognition -Here we take "Training a paraformer model from scratch using the AISHELL-1 dataset" as an example to introduce how to use FunASR. According to this example, users can similarly employ other datasets (such as AISHELL-2 dataset, etc.) 
to train other models (such as conformer, transformer, etc.). - -## Overall Introduction -We provide a recipe `egs/aishell/paraformer/run.sh` for training a paraformer model on AISHELL-1 dataset. This recipe consists of five stages, supporting training on multiple GPUs and decoding by CPU or GPU. Before introducing each stage in detail, we first explain several parameters which should be set by users. -- `CUDA_VISIBLE_DEVICES`: visible gpu list -- `gpu_num`: the number of GPUs used for training -- `gpu_inference`: whether to use GPUs for decoding -- `njob`: for CPU decoding, indicating the total number of CPU jobs; for GPU decoding, indicating the number of jobs on each GPU -- `data_aishell`: the raw path of AISHELL-1 dataset -- `feats_dir`: the path for saving processed data -- `nj`: the number of jobs for data preparation -- `speed_perturb`: the range of speech perturbed -- `exp_dir`: the path for saving experimental results -- `tag`: the suffix of experimental result directory - -## Stage 0: Data preparation -This stage processes raw AISHELL-1 dataset `$data_aishell` and generates the corresponding `wav.scp` and `text` in `$feats_dir/data/xxx`. `xxx` means `train/dev/test`. Here we assume users have already downloaded AISHELL-1 dataset. If not, users can download data [here](https://www.openslr.org/33/) and set the path for `$data_aishell`. The examples of `wav.scp` and `text` are as follows: -* `wav.scp` -``` -BAC009S0002W0122 /nfs/ASR_DATA/AISHELL-1/data_aishell/wav/train/S0002/BAC009S0002W0122.wav -BAC009S0002W0123 /nfs/ASR_DATA/AISHELL-1/data_aishell/wav/train/S0002/BAC009S0002W0123.wav -BAC009S0002W0124 /nfs/ASR_DATA/AISHELL-1/data_aishell/wav/train/S0002/BAC009S0002W0124.wav -... -``` -* `text` -``` -BAC009S0002W0122 而 对 楼 市 成 交 抑 制 作 用 最 大 的 限 购 -BAC009S0002W0123 也 成 为 地 方 政 府 的 眼 中 钉 -BAC009S0002W0124 自 六 月 底 呼 和 浩 特 市 率 先 宣 布 取 消 限 购 后 -... 
-``` -These two files both have two columns, while the first column is wav ids and the second column is the corresponding wav paths/label tokens. - -## Stage 1: Feature Generation -This stage extracts FBank features from `wav.scp` and apply speed perturbation as data augmentation according to `speed_perturb`. Users can set `nj` to control the number of jobs for feature generation. The generated features are saved in `$feats_dir/dump/xxx/ark` and the corresponding `feats.scp` files are saved as `$feats_dir/dump/xxx/feats.scp`. An example of `feats.scp` can be seen as follows: -* `feats.scp` -``` -... -BAC009S0002W0122_sp0.9 /nfs/funasr_data/aishell-1/dump/fbank/train/ark/feats.16.ark:592751055 -... -``` -Note that samples in this file have already been shuffled randomly. This file contains two columns. The first column is wav ids while the second column is kaldi-ark feature paths. Besides, `speech_shape` and `text_shape` are also generated in this stage, denoting the speech feature shape and text length of each sample. The examples are shown as follows: -* `speech_shape` -``` -... -BAC009S0002W0122_sp0.9 665,80 -... -``` -* `text_shape` -``` -... -BAC009S0002W0122_sp0.9 15 -... -``` -These two files have two columns. The first column is wav ids and the second column is the corresponding speech feature shape and text length. - -## Stage 2: Dictionary Preparation -This stage processes the dictionary, which is used as a mapping between label characters and integer indices during ASR training. The processed dictionary file is saved as `$feats_dir/data/$lang_toekn_list/$token_type/tokens.txt`. An example of `tokens.txt` is as follows: -* `tokens.txt` -``` - - - -一 -丁 -... -龚 -龟 - -``` -* ``: indicates the blank token for CTC -* ``: indicates the start-of-sentence token -* ``: indicates the end-of-sentence token -* ``: indicates the out-of-vocabulary token - -## Stage 3: Training -This stage achieves the training of the specified model. 
To start training, users should manually set `exp_dir`, `CUDA_VISIBLE_DEVICES` and `gpu_num`, which have already been explained above. By default, the best `$keep_nbest_models` checkpoints on validation dataset will be averaged to generate a better model and adopted for decoding. - -* DDP Training - -We support the DistributedDataParallel (DDP) training and the detail can be found [here](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html). To enable DDP training, please set `gpu_num` greater than 1. For example, if you set `CUDA_VISIBLE_DEVICES=0,1,5,6,7` and `gpu_num=3`, then the gpus with ids 0, 1 and 5 will be used for training. - -* DataLoader - -We support an optional iterable-style DataLoader based on [Pytorch Iterable-style DataPipes](https://pytorch.org/data/beta/torchdata.datapipes.iter.html) for large dataset and users can set `dataset_type=large` to enable it. - -* Configuration - -The parameters of the training, including model, optimization, dataset, etc., can be set by a YAML file in `conf` directory. Also, users can directly set the parameters in `run.sh` recipe. Please avoid to set the same parameters in both the YAML file and the recipe. - -* Training Steps - -We support two parameters to specify the training steps, namely `max_epoch` and `max_update`. `max_epoch` indicates the total training epochs while `max_update` indicates the total training steps. If these two parameters are specified at the same time, once the training reaches any one of these two parameters, the training will be stopped. - -* Tensorboard - -Users can use tensorboard to observe the loss, learning rate, etc. Please run the following command: -``` -tensorboard --logdir ${exp_dir}/exp/${model_dir}/tensorboard/train -``` - -## Stage 4: Decoding -This stage generates the recognition results and calculates the `CER` to verify the performance of the trained model. 
- -* Mode Selection - -As we support paraformer, uniasr, conformer and other models in FunASR, a `mode` parameter should be specified as `asr/paraformer/uniasr` according to the trained model. - -* Configuration - -We support CTC decoding, attention decoding and hybrid CTC-attention decoding in FunASR, which can be specified by `ctc_weight` in a YAML file in `conf` directory. Specifically, `ctc_weight=1.0` indicates CTC decoding, `ctc_weight=0.0` indicates attention decoding, `0.0 - - -一 -丁 -... -龚 -龟 - -``` -* ``: indicates the blank token for CTC -* ``: indicates the start-of-sentence token -* ``: indicates the end-of-sentence token -* ``: indicates the out-of-vocabulary token - -## Stage 3: Training -This stage achieves the training of the specified model. To start training, users should manually set `exp_dir`, `CUDA_VISIBLE_DEVICES` and `gpu_num`, which have already been explained above. By default, the best `$keep_nbest_models` checkpoints on validation dataset will be averaged to generate a better model and adopted for decoding. - -* DDP Training - -We support the DistributedDataParallel (DDP) training and the detail can be found [here](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html). To enable DDP training, please set `gpu_num` greater than 1. For example, if you set `CUDA_VISIBLE_DEVICES=0,1,5,6,7` and `gpu_num=3`, then the gpus with ids 0, 1 and 5 will be used for training. - -* DataLoader - -We support an optional iterable-style DataLoader based on [Pytorch Iterable-style DataPipes](https://pytorch.org/data/beta/torchdata.datapipes.iter.html) for large dataset and users can set `dataset_type=large` to enable it. - -* Configuration - -The parameters of the training, including model, optimization, dataset, etc., can be set by a YAML file in `conf` directory. Also, users can directly set the parameters in `run.sh` recipe. Please avoid to set the same parameters in both the YAML file and the recipe. 
- -* Training Steps - -We support two parameters to specify the training steps, namely `max_epoch` and `max_update`. `max_epoch` indicates the total training epochs while `max_update` indicates the total training steps. If these two parameters are specified at the same time, once the training reaches any one of these two parameters, the training will be stopped. - -* Tensorboard - -Users can use tensorboard to observe the loss, learning rate, etc. Please run the following command: -``` -tensorboard --logdir ${exp_dir}/exp/${model_dir}/tensorboard/train -``` - -## Stage 4: Decoding -This stage generates the recognition results and calculates the `CER` to verify the performance of the trained model. - -* Mode Selection - -As we support paraformer, uniasr, conformer and other models in FunASR, a `mode` parameter should be specified as `asr/paraformer/uniasr` according to the trained model. - -* Configuration - -We support CTC decoding, attention decoding and hybrid CTC-attention decoding in FunASR, which can be specified by `ctc_weight` in a YAML file in `conf` directory. Specifically, `ctc_weight=1.0` indicates CTC decoding, `ctc_weight=0.0` indicates attention decoding, `0.0 - - -一 -丁 -... -龚 -龟 - -``` -* ``: indicates the blank token for CTC -* ``: indicates the start-of-sentence token -* ``: indicates the end-of-sentence token -* ``: indicates the out-of-vocabulary token - -## Stage 3: Training -This stage achieves the training of the specified model. To start training, users should manually set `exp_dir`, `CUDA_VISIBLE_DEVICES` and `gpu_num`, which have already been explained above. By default, the best `$keep_nbest_models` checkpoints on validation dataset will be averaged to generate a better model and adopted for decoding. - -* DDP Training - -We support the DistributedDataParallel (DDP) training and the detail can be found [here](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html). To enable DDP training, please set `gpu_num` greater than 1. 
For example, if you set `CUDA_VISIBLE_DEVICES=0,1,5,6,7` and `gpu_num=3`, then the gpus with ids 0, 1 and 5 will be used for training. - -* DataLoader - -We support an optional iterable-style DataLoader based on [Pytorch Iterable-style DataPipes](https://pytorch.org/data/beta/torchdata.datapipes.iter.html) for large dataset and users can set `dataset_type=large` to enable it. - -* Configuration - -The parameters of the training, including model, optimization, dataset, etc., can be set by a YAML file in `conf` directory. Also, users can directly set the parameters in `run.sh` recipe. Please avoid to set the same parameters in both the YAML file and the recipe. - -* Training Steps - -We support two parameters to specify the training steps, namely `max_epoch` and `max_update`. `max_epoch` indicates the total training epochs while `max_update` indicates the total training steps. If these two parameters are specified at the same time, once the training reaches any one of these two parameters, the training will be stopped. - -* Tensorboard - -Users can use tensorboard to observe the loss, learning rate, etc. Please run the following command: -``` -tensorboard --logdir ${exp_dir}/exp/${model_dir}/tensorboard/train -``` - -## Stage 4: Decoding -This stage generates the recognition results and calculates the `CER` to verify the performance of the trained model. - -* Mode Selection - -As we support paraformer, uniasr, conformer and other models in FunASR, a `mode` parameter should be specified as `asr/paraformer/uniasr` according to the trained model. - -* Configuration - -We support CTC decoding, attention decoding and hybrid CTC-attention decoding in FunASR, which can be specified by `ctc_weight` in a YAML file in `conf` directory. Specifically, `ctc_weight=1.0` indicates CTC decoding, `ctc_weight=0.0` indicates attention decoding, `0.0 - - -一 -丁 -... 
-龚 -龟 - -``` -* ``: indicates the blank token for CTC -* ``: indicates the start-of-sentence token -* ``: indicates the end-of-sentence token -* ``: indicates the out-of-vocabulary token - -## Stage 3: Training -This stage achieves the training of the specified model. To start training, users should manually set `exp_dir`, `CUDA_VISIBLE_DEVICES` and `gpu_num`, which have already been explained above. By default, the best `$keep_nbest_models` checkpoints on validation dataset will be averaged to generate a better model and adopted for decoding. - -* DDP Training - -We support the DistributedDataParallel (DDP) training and the detail can be found [here](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html). To enable DDP training, please set `gpu_num` greater than 1. For example, if you set `CUDA_VISIBLE_DEVICES=0,1,5,6,7` and `gpu_num=3`, then the gpus with ids 0, 1 and 5 will be used for training. - -* DataLoader - -We support an optional iterable-style DataLoader based on [Pytorch Iterable-style DataPipes](https://pytorch.org/data/beta/torchdata.datapipes.iter.html) for large dataset and users can set `dataset_type=large` to enable it. - -* Configuration - -The parameters of the training, including model, optimization, dataset, etc., can be set by a YAML file in `conf` directory. Also, users can directly set the parameters in `run.sh` recipe. Please avoid to set the same parameters in both the YAML file and the recipe. - -* Training Steps - -We support two parameters to specify the training steps, namely `max_epoch` and `max_update`. `max_epoch` indicates the total training epochs while `max_update` indicates the total training steps. If these two parameters are specified at the same time, once the training reaches any one of these two parameters, the training will be stopped. - -* Tensorboard - -Users can use tensorboard to observe the loss, learning rate, etc. 
Please run the following command: -``` -tensorboard --logdir ${exp_dir}/exp/${model_dir}/tensorboard/train -``` - -## Stage 4: Decoding -This stage generates the recognition results and calculates the `CER` to verify the performance of the trained model. - -* Mode Selection - -As we support paraformer, uniasr, conformer and other models in FunASR, a `mode` parameter should be specified as `asr/paraformer/uniasr` according to the trained model. - -* Configuration - -We support CTC decoding, attention decoding and hybrid CTC-attention decoding in FunASR, which can be specified by `ctc_weight` in a YAML file in `conf` directory. Specifically, `ctc_weight=1.0` indicates CTC decoding, `ctc_weight=0.0` indicates attention decoding, `0.0 - - -一 -丁 -... -龚 -龟 - -``` -* ``: indicates the blank token for CTC -* ``: indicates the start-of-sentence token -* ``: indicates the end-of-sentence token -* ``: indicates the out-of-vocabulary token - -## Stage 3: Training -This stage achieves the training of the specified model. To start training, users should manually set `exp_dir`, `CUDA_VISIBLE_DEVICES` and `gpu_num`, which have already been explained above. By default, the best `$keep_nbest_models` checkpoints on validation dataset will be averaged to generate a better model and adopted for decoding. - -* DDP Training - -We support the DistributedDataParallel (DDP) training and the detail can be found [here](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html). To enable DDP training, please set `gpu_num` greater than 1. For example, if you set `CUDA_VISIBLE_DEVICES=0,1,5,6,7` and `gpu_num=3`, then the gpus with ids 0, 1 and 5 will be used for training. - -* DataLoader - -We support an optional iterable-style DataLoader based on [Pytorch Iterable-style DataPipes](https://pytorch.org/data/beta/torchdata.datapipes.iter.html) for large dataset and users can set `dataset_type=large` to enable it. 
- -* Configuration - -The parameters of the training, including model, optimization, dataset, etc., can be set by a YAML file in `conf` directory. Also, users can directly set the parameters in `run.sh` recipe. Please avoid to set the same parameters in both the YAML file and the recipe. - -* Training Steps - -We support two parameters to specify the training steps, namely `max_epoch` and `max_update`. `max_epoch` indicates the total training epochs while `max_update` indicates the total training steps. If these two parameters are specified at the same time, once the training reaches any one of these two parameters, the training will be stopped. - -* Tensorboard - -Users can use tensorboard to observe the loss, learning rate, etc. Please run the following command: -``` -tensorboard --logdir ${exp_dir}/exp/${model_dir}/tensorboard/train -``` - -## Stage 4: Decoding -This stage generates the recognition results and calculates the `CER` to verify the performance of the trained model. - -* Mode Selection - -As we support paraformer, uniasr, conformer and other models in FunASR, a `mode` parameter should be specified as `asr/paraformer/uniasr` according to the trained model. - -* Configuration - -We support CTC decoding, attention decoding and hybrid CTC-attention decoding in FunASR, which can be specified by `ctc_weight` in a YAML file in `conf` directory. Specifically, `ctc_weight=1.0` indicates CTC decoding, `ctc_weight=0.0` indicates attention decoding, `0.0 | --> +| | diff --git a/docs/m2met2/Introduction.md b/docs/m2met2/Introduction.md index eac9eb6b5..fc7c356d7 100644 --- a/docs/m2met2/Introduction.md +++ b/docs/m2met2/Introduction.md @@ -6,23 +6,23 @@ Over the years, several challenges have been organized to advance the developmen The ICASSP2022 M2MeT challenge focuses on meeting scenarios, and it comprises two main tasks: speaker diarization and multi-speaker automatic speech recognition. 
The former involves identifying who spoke when in the meeting, while the latter aims to transcribe speech from multiple speakers simultaneously, which poses significant technical difficulties due to overlapping speech and acoustic interferences. -Building on the success of the previous M2MeT challenge, we are excited to propose the M2MeT2.0 challenge as an ASRU2023 challenge special session. In the original M2MeT challenge, the evaluation metric was speaker-independent, which meant that the transcription could be determined, but not the corresponding speaker. To address this limitation and further advance the current multi-talker ASR system towards practicality, the M2MeT2.0 challenge proposes the speaker-attributed ASR task with two sub-tracks: fixed and open training conditions. The speaker-attribute automatic speech recognition (ASR) task aims to tackle the practical and challenging problem of identifying "who spoke what at when". To facilitate reproducible research in this field, we offer a comprehensive overview of the dataset, rules, evaluation metrics, and baseline systems. Furthermore, we will release a carefully curated test set, comprising approximately 10 hours of audio, according to the timeline. The new test set is designed to enable researchers to validate and compare their models' performance and advance the state of the art in this area. +Building on the success of the previous M2MeT challenge, we are excited to propose the M2MeT2.0 challenge as an ASRU 2023 challenge special session. In the original M2MeT challenge, the evaluation metric was speaker-independent, which meant that the transcription could be determined, but not the corresponding speaker. To address this limitation and further advance the current multi-talker ASR system towards practicality, the M2MeT2.0 challenge proposes the speaker-attributed ASR task with two sub-tracks: fixed and open training conditions. 
The speaker-attributed automatic speech recognition (ASR) task aims to tackle the practical and challenging problem of identifying "who spoke what and when". To facilitate reproducible research in this field, we offer a comprehensive overview of the dataset, rules, evaluation metrics, and baseline systems. Furthermore, we will release a carefully curated test set, comprising approximately 10 hours of audio, according to the timeline. The new test set is designed to enable researchers to validate and compare their models' performance and advance the state of the art in this area. ## Timeline(AOE Time) - $ April~29, 2023: $ Challenge and registration open. -- $ May~8, 2023: $ Baseline release. -- $ May~15, 2023: $ Registration deadline, the due date for participants to join the Challenge. -- $ June~9, 2023: $ Test data release and leaderboard open. -- $ June~13, 2023: $ Final submission deadline. -- $ June~19, 2023: $ Evaluation result and ranking release. +- $ May~11, 2023: $ Baseline release. +- $ May~22, 2023: $ Registration deadline, the due date for participants to join the Challenge. +- $ June~16, 2023: $ Test data release and leaderboard open. +- $ June~20, 2023: $ Final submission deadline and leaderboard close. +- $ June~26, 2023: $ Evaluation result and ranking release. - $ July~3, 2023: $ Deadline for paper submission. - $ July~10, 2023: $ Deadline for final paper submission. -- $ December~12\ to\ 16, 2023: $ ASRU Workshop and challenge session +- $ December~12\ to\ 16, 2023: $ ASRU Workshop and Challenge Session. ## Guidelines -Interested participants, whether from academia or industry, must register for the challenge by completing the Google form below. The deadline for registration is May 15, 2023. +Interested participants, whether from academia or industry, must register for the challenge by completing the Google form below. The deadline for registration is May 22, 2023. 
Participants are also welcome to join the [wechat group](https://alibaba-damo-academy.github.io/FunASR/m2met2/Contact.html) of M2MeT2.0 and keep up to date with the latest updates about the challenge. -[M2MET2.0 Registration](https://docs.google.com/forms/d/e/1FAIpQLSf77T9vAl7Ym-u5g8gXu18SBofoWRaFShBo26Ym0-HDxHW9PQ/viewform?usp=sf_link) +[M2MeT2.0 Registration](https://docs.google.com/forms/d/e/1FAIpQLSf77T9vAl7Ym-u5g8gXu18SBofoWRaFShBo26Ym0-HDxHW9PQ/viewform?usp=sf_link) -Within three working days, the challenge organizer will send email invitations to eligible teams to participate in the challenge. All qualified teams are required to adhere to the challenge rules, which will be published on the challenge page. Prior to the ranking release time, each participant must submit a system description document detailing their approach and methods. The organizer will select the top three submissions to be included in the ASRU2023 Proceedings. +Within three working days, the challenge organizer will send email invitations to eligible teams to participate in the challenge. All qualified teams are required to adhere to the challenge rules, which will be published on the challenge page. Prior to the ranking release time, each participant must submit a system description document detailing their approach and methods. The organizer will select the top-ranking submissions to be included in the ASRU 2023 Proceedings. 
diff --git a/docs/m2met2/Organizers.md b/docs/m2met2/Organizers.md index e16c803af..f5a9da2ba 100644 --- a/docs/m2met2/Organizers.md +++ b/docs/m2met2/Organizers.md @@ -1,5 +1,5 @@ # Organizers -***Lei Xie, Professor, Northwestern Polytechnical University, China*** +***Lei Xie, Professor, AISHELL foundation, China*** Email: [lxie@nwpu.edu.cn](mailto:lxie@nwpu.edu.cn) diff --git a/docs/m2met2/_build/doctrees/Baseline.doctree b/docs/m2met2/_build/doctrees/Baseline.doctree index 9fc7c50bc..f6ea62f86 100644 Binary files a/docs/m2met2/_build/doctrees/Baseline.doctree and b/docs/m2met2/_build/doctrees/Baseline.doctree differ diff --git a/docs/m2met2/_build/doctrees/Contact.doctree b/docs/m2met2/_build/doctrees/Contact.doctree index e3f579ff5..0508819f9 100644 Binary files a/docs/m2met2/_build/doctrees/Contact.doctree and b/docs/m2met2/_build/doctrees/Contact.doctree differ diff --git a/docs/m2met2/_build/doctrees/Introduction.doctree b/docs/m2met2/_build/doctrees/Introduction.doctree index 84f1baaa6..6ffceef26 100644 Binary files a/docs/m2met2/_build/doctrees/Introduction.doctree and b/docs/m2met2/_build/doctrees/Introduction.doctree differ diff --git a/docs/m2met2/_build/doctrees/Organizers.doctree b/docs/m2met2/_build/doctrees/Organizers.doctree index 0f571a3b7..7ecfbdf5a 100644 Binary files a/docs/m2met2/_build/doctrees/Organizers.doctree and b/docs/m2met2/_build/doctrees/Organizers.doctree differ diff --git a/docs/m2met2/_build/doctrees/environment.pickle b/docs/m2met2/_build/doctrees/environment.pickle index ea9c740c1..fe6805987 100644 Binary files a/docs/m2met2/_build/doctrees/environment.pickle and b/docs/m2met2/_build/doctrees/environment.pickle differ diff --git a/docs/m2met2/_build/html/.buildinfo b/docs/m2met2/_build/html/.buildinfo index d62b4cf37..97d32c4e5 100644 --- a/docs/m2met2/_build/html/.buildinfo +++ b/docs/m2met2/_build/html/.buildinfo @@ -1,4 +1,4 @@ # Sphinx build info version 1 # This file hashes the configuration used when building these files. 
When it is not found, a full rebuild will be done. -config: 9907eab6bf227ca0fc6db297f26919da +config: a62852d90c3e533904d811bbf85f977d tags: 645f666f9bcd5a90fca523b33c5a78b7 diff --git a/docs/m2met2/_build/html/Baseline.html b/docs/m2met2/_build/html/Baseline.html index e52d32275..62c656cca 100644 --- a/docs/m2met2/_build/html/Baseline.html +++ b/docs/m2met2/_build/html/Baseline.html @@ -15,7 +15,7 @@ - Baseline — m2met2 documentation + Baseline — MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0 @@ -44,7 +44,7 @@
  • previous |
  • - + @@ -55,7 +55,7 @@
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + #" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0
    + index.html" class="text-logo">MULTI-PARTY MEETING TRANSCRIPTION CHALLENGE 2.0 @@ -47,7 +47,7 @@
    + index.html" class="text-logo">多通道多方会议转录挑战2.0
    + #" class="text-logo">多通道多方会议转录挑战2.0