:: Library Catalog

Cover Image

Saved in:

Bibliographic Details
Main Authors:	Kamahori, Keisuke, Kasai, Jungo, Kojima, Noriyuki, Kasikci, Baris
Format:	Preprint
Published:	2025
Subjects:	Machine Learning Artificial Intelligence Sound Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2502.20583
Tags:	Add Tag No Tags, Be the first to tag this record!

Similar Items

VoxServe: Streaming-Centric Serving System for Speech Language Models
by: Kamahori, Keisuke, et al.
Published: (2026)

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Wang, He, et al.
Published: (2024)

TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition
by: Yang, Cheng-Yeh, et al.
Published: (2026)

Edge-ASR: Towards Low-Bit Quantization of Automatic Speech Recognition Models
by: Feng, Chen, et al.
Published: (2025)

Speech Recognition on TV Series with Video-guided Post-ASR Correction
by: Yang, Haoyuan, et al.
Published: (2025)

Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)

Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
by: Nakilcioglu, Emin Cagatay, et al.
Published: (2023)

Enhancing Automatic Speech Recognition Through Integrated Noise Detection Architecture
by: Singh, Karamvir
Published: (2025)

Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches
by: Aboeitta, Ahmed, et al.
Published: (2025)

Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech
by: Hajal, Karl El, et al.
Published: (2025)

Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR
by: Hajal, Karl El, et al.
Published: (2025)

Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
by: Nayeem, Md., et al.
Published: (2025)

Speech Recognition-based Feature Extraction for Enhanced Automatic Severity Classification in Dysarthric Speech
by: Choi, Yerin, et al.
Published: (2024)

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
by: Shi, Hao, et al.
Published: (2024)

LoRP-TTS: Low-Rank Personalized Text-To-Speech
by: Bondaruk, Łukasz, et al.
Published: (2025)

Do we really need Self-Attention for Streaming Automatic Speech Recognition?
by: Dkhissi, Youness, et al.
Published: (2026)

Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge
by: Qin, Ruiyang, et al.
Published: (2024)

ACES: Accent Subspaces for Coupling, Explanations, and Stress-Testing in Automatic Speech Recognition
by: Parekh, Swapnil
Published: (2026)

LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models
by: Oshima, Ryutaro, et al.
Published: (2026)

Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems
by: Raymondaud, Quentin, et al.
Published: (2024)

Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio
by: Barański, Mateusz, et al.
Published: (2025)

Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics
by: Goswami, Mandip
Published: (2026)

Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages
by: Pillai, Leena G, et al.
Published: (2024)

Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
by: Zhao, Qiuming, et al.
Published: (2025)

When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Systems
by: Chondhekar, Sujal, et al.
Published: (2025)

Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR
by: Segal-Feldman, Yael, et al.
Published: (2024)

HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization
by: Ahn, Hyebin, et al.
Published: (2025)

Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling
by: Pokel, Niclas, et al.
Published: (2025)

Enhanced Speech Emotion Recognition with Efficient Channel Attention Guided Deep CNN-BiLSTM Framework
by: Kundu, Niloy Kumar, et al.
Published: (2024)

Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
by: Sampath, Aneesha, et al.
Published: (2025)

Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin
by: Rufai, Amina Mardiyyah, et al.
Published: (2020)

Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space
by: Quintas, Sebastião, et al.
Published: (2024)

Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM
by: Zhang, Fengrun, et al.
Published: (2024)

CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR
by: Shao, Nian, et al.
Published: (2025)

GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems
by: Robatian, Amin, et al.
Published: (2025)

Toward Efficient Speech Emotion Recognition via Spectral Learning and Attention
by: Lee, HyeYoung, et al.
Published: (2025)

Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition
by: Su, Hsuan, et al.
Published: (2024)

Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey
by: Kheddar, Hamza, et al.
Published: (2024)

Interpreting Pretrained Speech Models for Automatic Speech Assessment of Voice Disorders
by: Lau, Hok-Shing, et al.
Published: (2024)

Robust Cross-Etiology and Speaker-Independent Dysarthric Speech Recognition
by: Singh, Satwinder, et al.
Published: (2025)