| Main Author: | Yang, Yiru |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.10313 |
Similar Items
A Unified Denoising and Adaptation Framework for Self-Supervised Bengali Dialectal ASR
by: Biswas, Swadhin, et al.
Published: (2025)
SAML: Speaker Adaptive Mixture of LoRA Experts for End-to-End ASR
by: Zhao, Qiuming, et al.
Published: (2024)
Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR
by: Pražák, Aleš, et al.
Published: (2025)
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech
by: Kim, Heeseung, et al.
Published: (2024)
Efficient Multilingual ASR Finetuning via LoRA Language Experts
by: Li, Jiahong, et al.
Published: (2025)
Synthetic Data Domain Adaptation for ASR via LLM-based Text and Phonetic Respelling Augmentation
by: Yamashita, Natsuo, et al.
Published: (2026)
Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR
by: Li, Longhao, et al.
Published: (2025)
SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition
by: Wang, Tianhao, et al.
Published: (2024)
HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR Models
by: Mu, Bingshen, et al.
Published: (2024)
Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge
by: Huang, Shangkun, et al.
Published: (2025)
Adapter-Based Multi-Agent AVSR Extension for Pre-Trained ASR Models
by: Simic, Christopher, et al.
Published: (2025)
Index-ASR Technical Report
by: Song, Zheshu, et al.
Published: (2025)
Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper
by: Thorbecke, Iuliia, et al.
Published: (2024)
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
by: Fan, Zhiyun, et al.
Published: (2024)
Speech Recognition on TV Series with Video-guided Post-ASR Correction
by: Yang, Haoyuan, et al.
Published: (2025)
SpecTokenizer: A Lightweight Streaming Codec in the Compressed Spectrum Domain
by: Wan, Zixiang, et al.
Published: (2025)
SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR
by: Shankar, Natarajan Balaji, et al.
Published: (2024)
Distilling Spectrograms into Tokens: Fast and Lightweight Bioacoustic Classification for BirdCLEF+ 2025
by: Miyaguchi, Anthony, et al.
Published: (2025)
Lightweight Resolution-Aware Audio Deepfake Detection via Cross-Scale Attention and Consistency Learning
by: Shahriar, K. A.
Published: (2026)
Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
by: Xu, Tianyi, et al.
Published: (2024)
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
by: Liu, Wei, et al.
Published: (2024)
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning
by: Tsai, Fang-Duo, et al.
Published: (2024)
Monaural speech enhancement on drone via Adapter based transfer learning
by: Chen, Xingyu, et al.
Published: (2024)
Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR
by: Wang, Weiqing, et al.
Published: (2025)
Efficient Rehearsal for Continual Learning in ASR via Singular Value Tuning
by: Eeckt, Steven Vander, et al.
Published: (2026)
Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)
A Language-Agnostic Hierarchical LoRA-MoE Architecture for CTC-based Multilingual ASR
by: Zheng, Yuang, et al.
Published: (2026)
kNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
by: Zhou, Jiaming, et al.
Published: (2023)
Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR
by: Yang, Yufeng, et al.
Published: (2024)
Efficient Scaling for LLM-based ASR
by: Mu, Bingshen, et al.
Published: (2025)
Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)
Speech Denoising with Auditory Models
by: Saddler, Mark R., et al.
Published: (2020)
Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses
by: Yang, Yufeng, et al.
Published: (2025)
PromptASR for contextualized ASR with controllable style
by: Yang, Xiaoyu, et al.
Published: (2023)
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
by: Kang, Wei, et al.
Published: (2023)
SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
by: Wei, Linye, et al.
Published: (2025)
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
by: Bai, Junwen, et al.
Published: (2024)
Thinking in cocktail party: Chain-of-Thought and reinforcement learning for target speaker automatic speech recognition
by: Zhang, Yiru, et al.
Published: (2025)
Technical Report: A Practical Guide to Kaldi ASR Optimization
by: Hong, Mengze, et al.
Published: (2025)
Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition
by: Yang, Yufeng, et al.
Published: (2025)