| Main Authors: | Xi, Yu; Ding, Wen; Yu, Kai; Lai, Junjie |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.04219 |
Similar Items
Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)
Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering
by: Carofilis, Andres, et al.
Published: (2025)
Align-Consistency: Improving Non-autoregressive and Semi-supervised ASR with Consistency Regularization
by: Huang, Wanting, et al.
Published: (2026)
MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR
by: Li, Junjie, et al.
Published: (2025)
TC-BiMamba: Trans-Chunk bidirectionally within BiMamba for unified streaming and non-streaming ASR
by: She, Qingshun, et al.
Published: (2026)
DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models
by: Li, Li, et al.
Published: (2026)
Doctor or Patient? Synergizing Diarization and ASR for Code-Switched Hinglish Medical Conditions Extraction
by: Baroudi, Séverin, et al.
Published: (2026)
AsyncSwitch: Asynchronous Text-Speech Adaptation for Code-Switched ASR
by: Nguyen, Tuan, et al.
Published: (2025)
LESS: Large Language Model Enhanced Semi-Supervised Learning for Speech Foundational Models Using in-the-wild Data
by: Ding, Wen, et al.
Published: (2025)
Advancing Multi-talker ASR Performance with Large Language Models
by: Shi, Mohan, et al.
Published: (2024)
A Survey on Speech Large Language Models for Understanding
by: Peng, Jing, et al.
Published: (2024)
Optimizing ASR for Catalan-Spanish Code-Switching: A Comparative Analysis of Methodologies
by: Mena, Carlos, et al.
Published: (2025)
Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding
by: Xi, Yu, et al.
Published: (2025)
FairASR: Fair Audio Contrastive Learning for Automatic Speech Recognition
by: Kim, Jongsuk, et al.
Published: (2025)
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
by: Yang, Chih-Kai, et al.
Published: (2023)
Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM
by: Zhang, Fengrun, et al.
Published: (2024)
TASU: Text-Only Alignment for Speech Understanding
by: Peng, Jing, et al.
Published: (2025)
NTC-KWS: Noise-aware CTC for Robust Keyword Spotting
by: Xi, Yu, et al.
Published: (2024)
Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning
by: Kong, YuXiang, et al.
Published: (2025)
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR
by: Ye, Shuaishuai, et al.
Published: (2024)
Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario
by: Wen, Wen, et al.
Published: (2024)
ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction
by: Wei, Victor Junqiu, et al.
Published: (2024)
CAMEL: Cross-Attention Enhanced Mixture-of-Experts and Language Bias for Code-Switching Speech Recognition
by: Wang, He, et al.
Published: (2024)
Large Language Models based ASR Error Correction for Child Conversations
by: Xu, Anfeng, et al.
Published: (2025)
AdaCS: Adaptive Normalization for Enhanced Code-Switching ASR
by: Chu, The Chuong, et al.
Published: (2025)
Robust ASR Error Correction with Conservative Data Filtering
by: Udagawa, Takuma, et al.
Published: (2024)
Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment
by: Shao, Yiwen, et al.
Published: (2024)
UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
by: Guan, Wenhao, et al.
Published: (2025)
ASR Error Correction using Large Language Models
by: Ma, Rao, et al.
Published: (2024)
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning
by: Zhu, Xinfa, et al.
Published: (2023)
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)
Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models
by: Fan, Ruchao, et al.
Published: (2024)
Federated Learning of Large ASR Models in the Real World
by: Xiao, Yonghui, et al.
Published: (2024)
Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech
by: Xi, Yu, et al.
Published: (2024)
Inverse-Hessian Regularization for Continual Learning in ASR
by: Eeckt, Steven Vander, et al.
Published: (2026)
Efficient Scaling for LLM-based ASR
by: Mu, Bingshen, et al.
Published: (2025)
Context and System Fusion in Post-ASR Emotion Recognition with Large Language Models
by: Stepachev, Pavel, et al.
Published: (2024)
All-in-One ASR: Unifying Encoder-Decoder Models of CTC, Attention, and Transducer in Dual-Mode ASR
by: Moriya, Takafumi, et al.
Published: (2025)
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
by: Yang, Guanrou, et al.
Published: (2024)