:: Library Catalog

Bibliographic Details
Main Authors:	Cheng, Gaofeng; Lu, Haitian; Yang, Chengxu; Wang, Xuyang; Li, Ta; Yan, Yonghong
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Computation and Language
Online Access:	https://arxiv.org/abs/2501.00804

Similar Items

PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition
by: Fu, Li, et al.
Published: (2025)

SLIDE: Integrating Speech Language Model with LLM for Spontaneous Spoken Dialogue Generation
by: Lu, Haitian, et al.
Published: (2025)

Fine-Tuning Large Multimodal Models for Automatic Pronunciation Assessment
by: Wang, Ke, et al.
Published: (2025)

Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation
by: Liu, Changsong, et al.
Published: (2025)

Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss
by: Shakeel, Muhammad, et al.
Published: (2024)

Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation
by: Huang, Ruizhe, et al.
Published: (2024)

Transliterated Zero-Shot Domain Adaptation for Automatic Speech Recognition
by: Zhu, Han, et al.
Published: (2024)

Lightweight Prompt Biasing for Contextualized End-to-End ASR Systems
by: Ren, Bo, et al.
Published: (2025)

Improving ASR Contextual Biasing with Guided Attention
by: Tang, Jiyang, et al.
Published: (2024)

Pronunciation Assessment with Multi-modal Large Language Models
by: Fu, Kaiqi, et al.
Published: (2024)

OWSM-Biasing: Contextualizing Open Whisper-Style Speech Models for Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2025)

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
by: Xu, Hainan, et al.
Published: (2024)

Enhancing the Robustness of Contextual ASR to Varying Biasing Information Volumes Through Purified Semantic Correlation Joint Modeling
by: Gu, Yue, et al.
Published: (2025)

Towards Unsupervised Speech Recognition Without Pronunciation Models
by: Ni, Junrui, et al.
Published: (2024)

Exploring the Potential of Large Multimodal Models as Effective Alternatives for Pronunciation Assessment
by: Wang, Ke, et al.
Published: (2025)

Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
by: Sun, Guangzhi, et al.
Published: (2022)

WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing
by: Nakagome, Yu, et al.
Published: (2025)

UtterTune: LoRA-Based Target-Language Pronunciation Edit and Control in Multilingual Text-to-Speech
by: Kato, Shuhei
Published: (2025)

Text Injection for Neural Contextual Biasing
by: Meng, Zhong, et al.
Published: (2024)

Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation
by: Lin, Zhennan, et al.
Published: (2025)

Segmentation-free Goodness of Pronunciation
by: Cao, Xinwei, et al.
Published: (2025)

Contextualized Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2024)

Streaming Non-Autoregressive Model for Accent Conversion and Pronunciation Improvement
by: Nguyen, Tuan-Nam, et al.
Published: (2025)

Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation
by: Ellinas, Nikolaos, et al.
Published: (2022)

A Neural Model for Contextual Biasing Score Learning and Filtering
by: Huang, Wanting, et al.
Published: (2025)

Acquiring Pronunciation Knowledge from Transcribed Speech Audio via Multi-task Learning
by: Sun, Siqi, et al.
Published: (2024)

Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems
by: Kulkarni, Ajinkya, et al.
Published: (2025)

InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions
by: Nakagome, Yu, et al.
Published: (2024)

Towards Efficient and Multifaceted Computer-assisted Pronunciation Training Leveraging Hierarchical Selective State Space Model and Decoupled Cross-entropy Loss
by: Chao, Fu-An, et al.
Published: (2025)

Automatic Speech Recognition Biases in Newcastle English: an Error Analysis
by: Serditova, Dana, et al.
Published: (2025)

ConPCO: Preserving Phoneme Characteristics for Automatic Pronunciation Assessment Leveraging Contrastive Ordinal Regularization
by: Yan, Bi-Cheng, et al.
Published: (2024)

Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search
by: Sudo, Yui, et al.
Published: (2024)

MultiPA: A Multi-task Speech Pronunciation Assessment Model for Open Response Scenarios
by: Chen, Yu-Wen, et al.
Published: (2023)

Revisiting Interpolation Augmentation for Speech-to-Text Generation
by: Xu, Chen, et al.
Published: (2024)

Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech
by: Mujtaba, Dena, et al.
Published: (2024)

K-Function: Joint Pronunciation Transcription and Feedback for Evaluating Kids Language Function
by: Li, Shuhe, et al.
Published: (2025)

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis
by: Yang, Yifan, et al.
Published: (2025)

Transducer Consistency Regularization for Speech to Text Applications
by: Tseng, Cindy, et al.
Published: (2024)

Automatic Pronunciation Error Detection and Correction of the Holy Quran's Learners Using Deep Learning
by: Abdelfattah, Abdullah, et al.
Published: (2025)

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation
by: Wang, Siyin, et al.
Published: (2024)