mirror of
https://github.com/modelscope/FunASR
synced 2025-09-15 14:48:36 +08:00
docs
This commit is contained in:
parent
c6ae045608
commit
5196e7afd7
@ -1,4 +1,33 @@
|
||||
# Papers
|
||||
|
||||
FunASR have implemented the following paper code
|
||||
|
||||
### Speech Recognition Models
|
||||
- [Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition](https://arxiv.org/abs/2206.08317), INTERSPEECH 2022.
|
||||
- [Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model](https://arxiv.org/abs/2010.14099), arXiv preprint arXiv:2010.14099, 2020.
|
||||
- [Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition](https://arxiv.org/abs/2206.08317), INTERSPEECH 2022.
|
||||
- [San-m: Memory equipped self-attention for end-to-end speech recognition](https://arxiv.org/pdf/2006.01713), INTERSPEECH 2020
|
||||
- [Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition](https://arxiv.org/abs/2006.01712), INTERSPEECH 2020
|
||||
- [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100), INTERSPEECH 2020
|
||||
- [Sequence-to-sequence learning with Transducers](https://arxiv.org/pdf/1211.3711.pdf), NIPS 2016
|
||||
|
||||
|
||||
### Multi-talker Speech Recognition Models
|
||||
- [MFCCA:Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario](https://arxiv.org/abs/2210.05265), ICASSP 2022
|
||||
|
||||
### Voice Activity Detection
|
||||
- [Deep-FSMN for Large Vocabulary Continuous Speech Recognition](https://arxiv.org/abs/1803.05030), ICASSP 2018
|
||||
|
||||
### Punctuation Restoration
|
||||
- [CT-Transformer: Controllable time-delay transformer for real-time punctuation prediction and disfluency detection](https://arxiv.org/pdf/2003.01309.pdf), ICASSP 2018
|
||||
|
||||
### Language Models
|
||||
- [Attention Is All You Need](https://arxiv.org/abs/1706.03762), NEURIPS 2017
|
||||
|
||||
### Speaker Verification
|
||||
- [X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION](https://www.danielpovey.com/files/2018_icassp_xvectors.pdf), ICASSP 2018
|
||||
|
||||
### Speaker diarization
|
||||
- [Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis](https://arxiv.org/abs/2211.10243), EMNLP 2022
|
||||
|
||||
### Timestamp Prediction
|
||||
- [Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model](https://arxiv.org/abs/2301.12343), arXiv:2301.12343
|
||||
|
||||
Loading…
Reference in New Issue
Block a user