| Main Authors: | Nethil, Kumarmanas; Mishra, Vaibhav; Anandan, Kriti; Manohar, Kavya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.01021 |
Similar Items
Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages
by: Pillai, Leena G, et al.
Published: (2024)
PromptASR for contextualized ASR with controllable style
by: Yang, Xiaoyu, et al.
Published: (2023)
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024)
AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost
by: Gündüz, Ahmet, et al.
Published: (2024)
ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction
by: Wei, Victor Junqiu, et al.
Published: (2024)
Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)
NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR
by: Xie, Yuan, et al.
Published: (2026)
Promptformer: Prompted Conformer Transducer for ASR
by: Duarte-Torres, Sergio, et al.
Published: (2024)
Qwen3-ASR Technical Report
by: Shi, Xian, et al.
Published: (2026)
Revisiting Acoustic Features for Robust ASR
by: Shah, Muhammad A., et al.
Published: (2024)
Advancing Airport Tower Command Recognition: Integrating Squeeze-and-Excitation and Broadcasted Residual Learning
by: Lin, Yuanxi, et al.
Published: (2024)
Semi-Autoregressive Streaming ASR With Label Context
by: Arora, Siddhant, et al.
Published: (2023)
Exploring SSL Discrete Tokens for Multilingual ASR
by: Cui, Mingyu, et al.
Published: (2024)
Configurable Multilingual ASR with Speech Summary Representations
by: Zhu, Harrison, et al.
Published: (2024)
ManWav: The First Manchu ASR Model
by: Seo, Jean, et al.
Published: (2024)
Mamba for Streaming ASR Combined with Unimodal Aggregation
by: Fang, Ying, et al.
Published: (2024)
Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder
by: Shakeel, Muhammad, et al.
Published: (2025)
Causal Structure Discovery for Error Diagnostics of Children's ASR
by: Singh, Vishwanath Pratap, et al.
Published: (2025)
Performant ASR Models for Medical Entities in Accented Speech
by: Afonja, Tejumade, et al.
Published: (2024)
Reverb: Open-Source ASR and Diarization from Rev
by: Bhandari, Nishchal, et al.
Published: (2024)
ASR Error Correction using Large Language Models
by: Ma, Rao, et al.
Published: (2024)
WER We Stand: Benchmarking Urdu ASR Models
by: Arif, Samee, et al.
Published: (2024)
Extending Whisper with prompt tuning to target-speaker ASR
by: Ma, Hao, et al.
Published: (2023)
Crossmodal ASR Error Correction with Discrete Speech Units
by: Li, Yuanchao, et al.
Published: (2024)
Advocating Character Error Rate for Multilingual ASR Evaluation
by: K, Thennal D, et al.
Published: (2024)
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
by: Zhang, Yu, et al.
Published: (2023)
Vedavani: A Benchmark Corpus for ASR on Vedic Sanskrit Poetry
by: Kumar, Sujeet, et al.
Published: (2025)
Pruning as Regularization: Sensitivity-Aware One-Shot Pruning in ASR
by: Irigoyen, Julian, et al.
Published: (2025)
ASR Benchmarking: Need for a More Representative Conversational Dataset
by: Maheshwari, Gaurav, et al.
Published: (2024)
Spelling Correction through Rewriting of Non-Autoregressive ASR Lattices
by: Velikovich, Leonid, et al.
Published: (2024)
OCR-Enhanced Multimodal ASR Can Read While Listening
by: Chen, Junli, et al.
Published: (2026)
ProGRes: Prompted Generative Rescoring on ASR n-Best
by: Tur, Ada Defne, et al.
Published: (2024)
Alignment-Free Training for Transducer-based Multi-Talker ASR
by: Moriya, Takafumi, et al.
Published: (2024)
Locality enhanced dynamic biasing and sampling strategies for contextual ASR
by: Jalal, Md Asif, et al.
Published: (2024)
The THUEE System Description for the IARPA OpenASR21 Challenge
by: Zhao, Jing, et al.
Published: (2022)
Quantizing Whisper-small: How design choices affect ASR performance
by: Söhler, Arthur, et al.
Published: (2025)
ASR-FAIRBENCH: Measuring and Benchmarking Equity Across Speech Recognition Systems
by: Rai, Anand, et al.
Published: (2025)
AsyncSwitch: Asynchronous Text-Speech Adaptation for Code-Switched ASR
by: Nguyen, Tuan, et al.
Published: (2025)
ContextASR-Bench: A Massive Contextual Speech Recognition Benchmark
by: Wang, He, et al.
Published: (2025)
WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
by: Orhon, Atila, et al.
Published: (2025)