:: Library Catalog

Bibliographic Details
Main Authors:	Nethil, Kumarmanas; Mishra, Vaibhav; Anandan, Kriti; Manohar, Kavya
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Computation and Language; Sound
Online Access:	https://arxiv.org/abs/2507.01021

Similar Items

Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages
by: Pillai, Leena G, et al.
Published: (2024)

PromptASR for contextualized ASR with controllable style
by: Yang, Xiaoyu, et al.
Published: (2023)

MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024)

AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost
by: Gündüz, Ahmet, et al.
Published: (2024)

ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction
by: Wei, Victor Junqiu, et al.
Published: (2024)

Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR
by: Xie, Yuan, et al.
Published: (2026)

Promptformer: Prompted Conformer Transducer for ASR
by: Duarte-Torres, Sergio, et al.
Published: (2024)

Qwen3-ASR Technical Report
by: Shi, Xian, et al.
Published: (2026)

Revisiting Acoustic Features for Robust ASR
by: Shah, Muhammad A., et al.
Published: (2024)

Advancing Airport Tower Command Recognition: Integrating Squeeze-and-Excitation and Broadcasted Residual Learning
by: Lin, Yuanxi, et al.
Published: (2024)

Semi-Autoregressive Streaming ASR With Label Context
by: Arora, Siddhant, et al.
Published: (2023)

Exploring SSL Discrete Tokens for Multilingual ASR
by: Cui, Mingyu, et al.
Published: (2024)

Configurable Multilingual ASR with Speech Summary Representations
by: Zhu, Harrison, et al.
Published: (2024)

ManWav: The First Manchu ASR Model
by: Seo, Jean, et al.
Published: (2024)

Mamba for Streaming ASR Combined with Unimodal Aggregation
by: Fang, Ying, et al.
Published: (2024)

Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder
by: Shakeel, Muhammad, et al.
Published: (2025)

Causal Structure Discovery for Error Diagnostics of Children's ASR
by: Singh, Vishwanath Pratap, et al.
Published: (2025)

Performant ASR Models for Medical Entities in Accented Speech
by: Afonja, Tejumade, et al.
Published: (2024)

Reverb: Open-Source ASR and Diarization from Rev
by: Bhandari, Nishchal, et al.
Published: (2024)

ASR Error Correction using Large Language Models
by: Ma, Rao, et al.
Published: (2024)

WER We Stand: Benchmarking Urdu ASR Models
by: Arif, Samee, et al.
Published: (2024)

Extending Whisper with prompt tuning to target-speaker ASR
by: Ma, Hao, et al.
Published: (2023)

Crossmodal ASR Error Correction with Discrete Speech Units
by: Li, Yuanchao, et al.
Published: (2024)

Advocating Character Error Rate for Multilingual ASR Evaluation
by: K, Thennal D, et al.
Published: (2024)

StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
by: Zhang, Yu, et al.
Published: (2023)

Vedavani: A Benchmark Corpus for ASR on Vedic Sanskrit Poetry
by: Kumar, Sujeet, et al.
Published: (2025)

Pruning as Regularization: Sensitivity-Aware One-Shot Pruning in ASR
by: Irigoyen, Julian, et al.
Published: (2025)

ASR Benchmarking: Need for a More Representative Conversational Dataset
by: Maheshwari, Gaurav, et al.
Published: (2024)

Spelling Correction through Rewriting of Non-Autoregressive ASR Lattices
by: Velikovich, Leonid, et al.
Published: (2024)

OCR-Enhanced Multimodal ASR Can Read While Listening
by: Chen, Junli, et al.
Published: (2026)

ProGRes: Prompted Generative Rescoring on ASR n-Best
by: Tur, Ada Defne, et al.
Published: (2024)

Alignment-Free Training for Transducer-based Multi-Talker ASR
by: Moriya, Takafumi, et al.
Published: (2024)

Locality enhanced dynamic biasing and sampling strategies for contextual ASR
by: Jalal, Md Asif, et al.
Published: (2024)

The THUEE System Description for the IARPA OpenASR21 Challenge
by: Zhao, Jing, et al.
Published: (2022)

Quantizing Whisper-small: How design choices affect ASR performance
by: Söhler, Arthur, et al.
Published: (2025)

ASR-FAIRBENCH: Measuring and Benchmarking Equity Across Speech Recognition Systems
by: Rai, Anand, et al.
Published: (2025)

AsyncSwitch: Asynchronous Text-Speech Adaptation for Code-Switched ASR
by: Nguyen, Tuan, et al.
Published: (2025)

ContextASR-Bench: A Massive Contextual Speech Recognition Benchmark
by: Wang, He, et al.
Published: (2025)

WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
by: Orhon, Atila, et al.
Published: (2025)