| Main Authors: | Xie, Yuan, Song, Jiaqi, Qiu, Guang, Wang, Xianliang, Qiao, Kai, Yuan, Junfeng, Liu, Shengqing, Zhang, Yi, Chen, Bowen, Lei, Ming, Gao, Jie, Wu, Jie |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.18105 |
Similar Items
Rethinking Entropy Allocation in LLM-based ASR: Understanding the Dynamics between Speech Encoders and LLMs
by: Xie, Yuan, et al.
Published: (2026)
Efficient Scaling for LLM-based ASR
by: Mu, Bingshen, et al.
Published: (2025)
Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR
by: Li, Longhao, et al.
Published: (2025)
Weakly Supervised Data Refinement and Flexible Sequence Compression for Efficient Thai LLM-based ASR
by: Shao, Mingchen, et al.
Published: (2025)
dLLM-ASR: A Faster Diffusion LLM-based Framework for Speech Recognition
by: Tian, Wenjie, et al.
Published: (2026)
Towards Robust Dysarthric Speech Recognition: LLM-Agent Post-ASR Correction Beyond WER
by: Zheng, Xiuwen, et al.
Published: (2026)
Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR
by: Yang, Yufeng, et al.
Published: (2024)
BR-ASR: Efficient and Scalable Bias Retrieval Framework for Contextual Biasing ASR in Speech LLM
by: Gong, Xun, et al.
Published: (2025)
Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR
by: Lin, Zhennan, et al.
Published: (2026)
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024)
The USTC-NERCSLIP Systems for The ICMC-ASR Challenge
by: Wu, Minghui, et al.
Published: (2024)
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
by: Xu, Kai-Tuo, et al.
Published: (2025)
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
by: Guo, Pengcheng, et al.
Published: (2024)
Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses
by: Yang, Yufeng, et al.
Published: (2025)
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
by: Xie, Jiamin, et al.
Published: (2023)
HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR Models
by: Mu, Bingshen, et al.
Published: (2024)
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
by: Bai, Ye, et al.
Published: (2024)
Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)
Index-ASR Technical Report
by: Song, Zheshu, et al.
Published: (2025)
PromptASR for contextualized ASR with controllable style
by: Yang, Xiaoyu, et al.
Published: (2023)
Revisiting Acoustic Features for Robust ASR
by: Shah, Muhammad A., et al.
Published: (2024)
Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty
by: Xue, Hongfei, et al.
Published: (2025)
Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction
by: Sachdev, Rithik, et al.
Published: (2024)
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets
by: Geng, Xuelong, et al.
Published: (2024)
Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)
Edge-ASR: Towards Low-Bit Quantization of Automatic Speech Recognition Models
by: Feng, Chen, et al.
Published: (2025)
Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR
by: Wang, Weiqing, et al.
Published: (2024)
Efficient Rehearsal for Continual Learning in ASR via Singular Value Tuning
by: Eeckt, Steven Vander, et al.
Published: (2026)
Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition
by: Yang, Yufeng, et al.
Published: (2025)
Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech
by: Xi, Yu, et al.
Published: (2024)
Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
by: Xu, Tianyi, et al.
Published: (2024)
SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
by: Wei, Linye, et al.
Published: (2025)
VIBEVOICE-ASR Technical Report
by: Peng, Zhiliang, et al.
Published: (2026)
AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost
by: Gündüz, Ahmet, et al.
Published: (2024)
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
by: Liu, Wei, et al.
Published: (2024)
persoDA: Personalized Data Augmentation for Personalized ASR
by: Parada, Pablo Peso, et al.
Published: (2025)
Speaker Adaptation for Quantised End-to-End ASR Models
by: Zhao, Qiuming, et al.
Published: (2024)
Comparative Analysis of ASR Methods for Speech Deepfake Detection
by: Salvi, Davide, et al.
Published: (2024)
Consistency Based Unsupervised Self-training For ASR Personalisation
by: Zhang, Jisi, et al.
Published: (2024)
Synthetic Data Domain Adaptation for ASR via LLM-based Text and Phonetic Respelling Augmentation
by: Yamashita, Natsuo, et al.
Published: (2026)