| Main Authors: | Xue, Hongfei, Gong, Rong, Shao, Mingchen, Xu, Xin, Wang, Lezhi, Xie, Lei, Bu, Hui, Zhou, Jiaming, Qin, Yong, Du, Jun, Li, Ming, Zhang, Binbin, Jia, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.05430 |
Similar Items
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection
by: Gong, Rong, et al.
Published: (2024)
Multilingual Stutter Event Detection for English, German, and Mandarin Speech
by: Haas, Felix, et al.
Published: (2026)
Leveraging LLM for Stuttering Speech: A Unified Architecture Bridging Recognition and Event Detection
by: Huang, Shangkun, et al.
Published: (2025)
FGCL: Fine-grained Contrastive Learning For Mandarin Stuttering Event Detection
by: Jiang, Han, et al.
Published: (2024)
WenetSpeech-Wu: Datasets, Benchmarks, and Models for a Unified Chinese Wu Dialect Speech Processing Ecosystem
by: Wang, Chengyou, et al.
Published: (2026)
The TEA-ASLP System for Multilingual Conversational Speech Recognition and Speech Diarization in MLC-SLM 2025 Challenge
by: Xue, Hongfei, et al.
Published: (2025)
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition
by: Xue, Hongfei, et al.
Published: (2023)
ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5
by: Zhou, Jiaming, et al.
Published: (2024)
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Wang, He, et al.
Published: (2024)
The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Tian, Jingguang, et al.
Published: (2024)
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition
by: Zhou, Jiaming, et al.
Published: (2025)
Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty
by: Xue, Hongfei, et al.
Published: (2025)
A Self-Training Approach for Whisper to Enhance Long Dysarthric Speech Recognition
by: Wang, Shiyao, et al.
Published: (2025)
Fine-Tuning ASR for Stuttered Speech: Personalized vs. Generalized Approaches
by: Mujtaba, Dena, et al.
Published: (2025)
The DKU System for Multi-Speaker Automatic Speech Recognition in MLC-SLM Challenge
by: Lin, Yuke, et al.
Published: (2025)
Investigation of Deep Neural Network Acoustic Modelling Approaches for Low Resource Accented Mandarin Speech Recognition
by: Xie, Xurong, et al.
Published: (2022)
Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
by: Sun, Haoqin, et al.
Published: (2024)
AISHELL-5: The First Open-Source In-Car Multi-Channel Multi-Speaker Speech Dataset for Automatic Speech Diarization and Recognition
by: Dai, Yuhang, et al.
Published: (2025)
EffectiveASR: A Single-Step Non-Autoregressive Mandarin Speech Recognition Architecture with High Accuracy and Inference Speed
by: Zhuang, Ziyang, et al.
Published: (2024)
PB-LRDWWS System for the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge
by: Wang, Shiyao, et al.
Published: (2024)
Seeing the Context: Rich Visual Context-Aware Speech Recognition via Multimodal Reasoning
by: Tian, Wenjie, et al.
Published: (2026)
Large Language Models for Dysfluency Detection in Stuttered Speech
by: Wagner, Dominik, et al.
Published: (2024)
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
by: Zhang, Tian-Hao, et al.
Published: (2023)
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization
by: Cornell, Samuele, et al.
Published: (2024)
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
by: Xu, Kai-Tuo, et al.
Published: (2025)
Summary on The Multilingual Conversational Speech Language Model Challenge: Datasets, Tasks, Baselines, and Methods
by: Mu, Bingshen, et al.
Published: (2025)
AISHELL6-whisper: A Chinese Mandarin Audio-visual Whisper Speech Dataset with Speech Recognition Baselines
by: Li, Cancan, et al.
Published: (2025)
MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)
Speech Emotion Recognition Using Fine-Tuned DWFormer: A Study on Track 1 of the IERPChallenge 2024
by: Wang, Honghong, et al.
Published: (2025)
Zero-Shot Recognition of Dysarthric Speech Using Commercial Automatic Speech Recognition and Multimodal Large Language Models
by: Alsayegh, Ali, et al.
Published: (2025)
Augmenting Polish Automatic Speech Recognition System With Synthetic Data
by: Bondaruk, Łukasz, et al.
Published: (2024)
Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition
by: Farhadipour, Aref, et al.
Published: (2024)
Rehearsal-Free Online Continual Learning for Automatic Speech Recognition
by: Vander Eeckt, Steven, et al.
Published: (2023)
Training Data Augmentation for Dysarthric Automatic Speech Recognition by Text-to-Dysarthric-Speech Synthesis
by: Leung, Wing-Zin, et al.
Published: (2024)
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
by: Chang, Xuankai, et al.
Published: (2024)
Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora
by: Nespoli, Francesco, et al.
Published: (2024)
Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge
by: Costa, Federico, et al.
Published: (2024)
MNV-17: A High-Quality Performative Mandarin Dataset for Nonverbal Vocalization Recognition in Speech
by: Mai, Jialong, et al.
Published: (2025)
Automatic Speech Recognition for Hindi
by: Saha, Anish, et al.
Published: (2024)
Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text
by: Xue, Hongfei, et al.
Published: (2024)