| Main Authors: | Xiao, Yang, Mahmudi, Aso, Thieberger, Nick, Ambikairajah, Eliathamby, Holden, Eun-Jung, Dang, Ting |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.06310 |
Similar Items
Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages
by: Xiao, Yang, et al.
Published: (2026)
Rethinking Continual Learning for Speech and Audio: A Representation-Centric Taxonomy and Open Problems
by: Xiao, Yang, et al.
Published: (2026)
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions
by: Meng, Hanyu, et al.
Published: (2024)
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)
Should Audio Front-ends be Adaptive? Comparing Learnable and Adaptive Front-ends
by: Zhang, Qiquan, et al.
Published: (2025)
Mamba in Speech: Towards an Alternative to Self-Attention
by: Zhang, Xiangyu, et al.
Published: (2024)
Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)
Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing
by: Meng, Hanyu, et al.
Published: (2025)
Transliterated Zero-Shot Domain Adaptation for Automatic Speech Recognition
by: Zhu, Han, et al.
Published: (2024)
Why Can't They Remember? Uncovering Representation and Retrieval Bottlenecks in Multi-Turn Acoustic Memory
by: Xiao, Yang, et al.
Published: (2026)
End-to-End Transformer-based Automatic Speech Recognition for Northern Kurdish: A Pioneering Approach
by: Abdullah, Abdulhady Abas, et al.
Published: (2024)
Weight Factorization and Centralization for Continual Learning in Speech Recognition
by: Ugan, Enes Yavuz, et al.
Published: (2025)
LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition
by: Yoon, Eunseop, et al.
Published: (2024)
Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation
by: Wang, Shiyao, et al.
Published: (2024)
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition
by: Shu, Yuchun, et al.
Published: (2024)
Characterization of Speech Similarity Between Australian Aboriginal and High-Resource Languages: A Case Study on Dharawal
by: Dang, Ting, et al.
Published: (2025)
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
by: Kashiwagi, Yosuke, et al.
Published: (2024)
In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties
by: Roll, Nathan, et al.
Published: (2025)
An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition
by: Wang, Yi-Cheng, et al.
Published: (2024)
Continual Speech Learning with Fused Speech Features
by: Wang, Guitao, et al.
Published: (2025)
Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features
by: Meng, Hanyu, et al.
Published: (2024)
Test-Time Adaptation for Speech Emotion Recognition
by: Dong, Jiaheng, et al.
Published: (2026)
Dynamic Data Pruning for Automatic Speech Recognition
by: Xiao, Qiao, et al.
Published: (2024)
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech
by: Lin, Guan-Ting, et al.
Published: (2024)
On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition
by: Rossenbach, Nick, et al.
Published: (2024)
Automatic Speech Recognition for Hindi
by: Saha, Anish, et al.
Published: (2024)
Universal Robust Speech Adaptation for Cross-Domain Speech Recognition and Enhancement
by: Wang, Chien-Chun, et al.
Published: (2026)
Continuous Speech Tokenizer in Text To Speech
by: Li, Yixing, et al.
Published: (2024)
Speech Recognition Rescoring with Large Speech-Text Foundation Models
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2024)
Towards Unsupervised Speech Recognition Without Pronunciation Models
by: Ni, Junrui, et al.
Published: (2024)
Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation
by: Lin, Zhennan, et al.
Published: (2025)
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding
by: Hu, Jiliang, et al.
Published: (2025)
On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures
by: Hilmes, Benedikt, et al.
Published: (2024)
Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)
Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition
by: Lee, Jeehyun, et al.
Published: (2024)
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
by: Wang, Yujin, et al.
Published: (2022)
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
by: Cornell, Samuele, et al.
Published: (2024)
UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction
by: Guo, Jiaxin, et al.
Published: (2024)
On the Contribution of Lexical Features to Speech Emotion Recognition
by: Combei, David
Published: (2025)
Contextualized Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2024)