| Main Authors: | Huang, Wanting, Wang, Weiran |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.23171 |
Similar Items
NLE: Non-autoregressive LLM-based ASR by Transcript Editing
by: Dekel, Avihu, et al.
Published: (2026)
LV-CTC: Non-autoregressive ASR with CTC and latent variable models
by: Fujita, Yuya, et al.
Published: (2024)
A Neural Model for Contextual Biasing Score Learning and Filtering
by: Huang, Wanting, et al.
Published: (2025)
Semi-supervised Learning for Code-Switching ASR with Large Language Model Filter
by: Xi, Yu, et al.
Published: (2024)
Consistency Based Unsupervised Self-training For ASR Personalisation
by: Zhang, Jisi, et al.
Published: (2024)
Towards Effective and Efficient Non-autoregressive decoders for Conformer and LLM-based ASR using Block-based Attention Mask
by: Wang, Tianzi, et al.
Published: (2025)
RCT: Random Consistency Training for Semi-supervised Sound Event Detection
by: Shao, Nian, et al.
Published: (2021)
UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models
by: Fan, Ruchao, et al.
Published: (2024)
Inverse-Hessian Regularization for Continual Learning in ASR
by: Eeckt, Steven Vander, et al.
Published: (2026)
Regularized autoregressive modeling and its application to audio signal reconstruction
by: Mokrý, Ondřej, et al.
Published: (2024)
Transducer Consistency Regularization for Speech to Text Applications
by: Tseng, Cindy, et al.
Published: (2024)
Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization
by: Andrusenko, Andrei, et al.
Published: (2026)
Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding
by: Xi, Yu, et al.
Published: (2025)
Improving Music Source Separation with Diffusion and Consistency Refinement
by: Karchkhadze, Tornike, et al.
Published: (2024)
MDM-ASR: Bridging Accuracy and Efficiency in ASR with Diffusion-Based Non-Autoregressive Decoding
by: Yen, Hao, et al.
Published: (2026)
Multimodal Consistency-Guided Reference-Free Data Selection for ASR Accent Adaptation
by: Lei, Ligong, et al.
Published: (2026)
Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering
by: Carofilis, Andres, et al.
Published: (2025)
Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss
by: Huang, Jiawen, et al.
Published: (2025)
AudioLCM: Text-to-Audio Generation with Latent Consistency Models
by: Liu, Huadai, et al.
Published: (2024)
DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models
by: Li, Li, et al.
Published: (2026)
Music Consistency Models
by: Fei, Zhengcong, et al.
Published: (2024)
InfiniteAudio: Infinite-Length Audio Generation with Consistency
by: Jung, Chaeyoung, et al.
Published: (2025)
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)
Semi-Autoregressive Streaming ASR With Label Context
by: Arora, Siddhant, et al.
Published: (2023)
DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation
by: Wang, Jianzong, et al.
Published: (2023)
VS-Singer: Vision-Guided Stereo Singing Voice Synthesis with Consistency Schrödinger Bridge
by: Zhao, Zijing, et al.
Published: (2025)
A Non-autoregressive Model for Joint STT and TTS
by: Sunder, Vishal, et al.
Published: (2025)
An Investigation on Combining Geometry and Consistency Constraints into Phase Estimation for Speech Enhancement
by: Ho, Chun-Wei, et al.
Published: (2025)
Mel-FullSubNet: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR
by: Zhou, Rui, et al.
Published: (2024)
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
by: Wagner, Dominik, et al.
Published: (2023)
A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization
by: He, Xiaoliang, et al.
Published: (2025)
Improving ASR Contextual Biasing with Guided Attention
by: Tang, Jiyang, et al.
Published: (2024)
LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks
by: Meghanani, Amit, et al.
Published: (2024)
Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models
by: Fan, Ruchao, et al.
Published: (2024)
Neurodyne: Neural Pitch Manipulation with Representation Learning and Cycle-Consistency GAN
by: Gu, Yicheng, et al.
Published: (2025)
Tweaking autoregressive methods for inpainting of gaps in audio signals
by: Mokrý, Ondřej, et al.
Published: (2024)
All-in-One ASR: Unifying Encoder-Decoder Models of CTC, Attention, and Transducer in Dual-Mode ASR
by: Moriya, Takafumi, et al.
Published: (2025)
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning
by: Zhu, Xinfa, et al.
Published: (2023)
Schrödinger Bridge Consistency Trajectory Models for Speech Enhancement
by: Nishigori, Shuichiro, et al.
Published: (2025)
Low-Cost Detection of Degraded Voice Clones via Source-Output Acoustic Consistency
by: Shokr, Jana, et al.
Published: (2026)