| Main Authors: | Kamahori, Keisuke, Kasai, Jungo, Kojima, Noriyuki, Kasikci, Baris |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.20583 |
Similar Items
VoxServe: Streaming-Centric Serving System for Speech Language Models
by: Kamahori, Keisuke, et al.
Published: (2026)
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Wang, He, et al.
Published: (2024)
TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition
by: Yang, Cheng-Yeh, et al.
Published: (2026)
Edge-ASR: Towards Low-Bit Quantization of Automatic Speech Recognition Models
by: Feng, Chen, et al.
Published: (2025)
Speech Recognition on TV Series with Video-guided Post-ASR Correction
by: Yang, Haoyuan, et al.
Published: (2025)
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)
Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
by: Nakilcioglu, Emin Cagatay, et al.
Published: (2023)
Enhancing Automatic Speech Recognition Through Integrated Noise Detection Architecture
by: Singh, Karamvir
Published: (2025)
Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches
by: Aboeitta, Ahmed, et al.
Published: (2025)
Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech
by: Hajal, Karl El, et al.
Published: (2025)
Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR
by: Hajal, Karl El, et al.
Published: (2025)
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
by: Nayeem, Md., et al.
Published: (2025)
Speech Recognition-based Feature Extraction for Enhanced Automatic Severity Classification in Dysarthric Speech
by: Choi, Yerin, et al.
Published: (2024)
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
by: Shi, Hao, et al.
Published: (2024)
LoRP-TTS: Low-Rank Personalized Text-To-Speech
by: Bondaruk, Łukasz, et al.
Published: (2025)
Do we really need Self-Attention for Streaming Automatic Speech Recognition?
by: Dkhissi, Youness, et al.
Published: (2026)
Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge
by: Qin, Ruiyang, et al.
Published: (2024)
ACES: Accent Subspaces for Coupling, Explanations, and Stress-Testing in Automatic Speech Recognition
by: Parekh, Swapnil
Published: (2026)
LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models
by: Oshima, Ryutaro, et al.
Published: (2026)
Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems
by: Raymondaud, Quentin, et al.
Published: (2024)
Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio
by: Barański, Mateusz, et al.
Published: (2025)
Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics
by: Goswami, Mandip
Published: (2026)
Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages
by: Pillai, Leena G, et al.
Published: (2024)
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
by: Zhao, Qiuming, et al.
Published: (2025)
When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Systems
by: Chondhekar, Sujal, et al.
Published: (2025)
Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR
by: Segal-Feldman, Yael, et al.
Published: (2024)
HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization
by: Ahn, Hyebin, et al.
Published: (2025)
Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling
by: Pokel, Niclas, et al.
Published: (2025)
Enhanced Speech Emotion Recognition with Efficient Channel Attention Guided Deep CNN-BiLSTM Framework
by: Kundu, Niloy Kumar, et al.
Published: (2024)
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
by: Sampath, Aneesha, et al.
Published: (2025)
Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin
by: Rufai, Amina Mardiyyah, et al.
Published: (2020)
Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space
by: Quintas, Sebastião, et al.
Published: (2024)
Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM
by: Zhang, Fengrun, et al.
Published: (2024)
CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR
by: Shao, Nian, et al.
Published: (2025)
GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems
by: Robatian, Amin, et al.
Published: (2025)
Toward Efficient Speech Emotion Recognition via Spectral Learning and Attention
by: Lee, HyeYoung, et al.
Published: (2025)
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition
by: Su, Hsuan, et al.
Published: (2024)
Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey
by: Kheddar, Hamza, et al.
Published: (2024)
Interpreting Pretrained Speech Models for Automatic Speech Assessment of Voice Disorders
by: Lau, Hok-Shing, et al.
Published: (2024)
Robust Cross-Etiology and Speaker-Independent Dysarthric Speech Recognition
by: Singh, Satwinder, et al.
Published: (2025)