| Main Authors: | Geng, Mengzhe, Xie, Xurong, Deng, Jiajun, Jin, Zengrui, Li, Guinan, Wang, Tianzi, Hu, Shujie, Li, Zhaoqing, Meng, Helen, Liu, Xunying |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.06310 |
Similar Items
Structured Speaker-Deficiency Adaptation of Foundation Models for Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)
On-the-fly Routing for Zero-shot MoE Speaker Adaptation of Speech Foundation Models for Dysarthric Speech Recognition
by: Hu, Shujie, et al.
Published: (2025)
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
by: Li, Guinan, et al.
Published: (2024)
Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
by: Jin, Zengrui, et al.
Published: (2022)
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation
by: Wang, Huimeng, et al.
Published: (2024)
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
by: Wang, Tianzi, et al.
Published: (2024)
Phone-purity Guided Discrete Tokens for Dysarthric Speech Recognition
by: Wang, Huimeng, et al.
Published: (2025)
Regularized Federated Learning for Privacy-Preserving Dysarthric and Elderly Speech Recognition
by: Zhong, Tao, et al.
Published: (2025)
Unfolding A Few Structures for The Many: Memory-Efficient Compression of Conformer and Speech Foundation Models
by: Li, Zhaoqing, et al.
Published: (2025)
Effective and Efficient Mixed Precision Quantization of Speech Foundation Models
by: Xu, Haoning, et al.
Published: (2025)
One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model
by: Li, Zhaoqing, et al.
Published: (2024)
MOPSA: Mixture of Prompt-Experts Based Speaker Adaptation for Elderly Speech Recognition
by: Deng, Chengxi, et al.
Published: (2025)
Towards Effective and Efficient Non-autoregressive decoders for Conformer and LLM-based ASR using Block-based Attention Mask
by: Wang, Tianzi, et al.
Published: (2025)
Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition
by: Chen, Youjun, et al.
Published: (2025)
Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition
by: Xie, Xurong, et al.
Published: (2022)
Perceiver-Prompt: Flexible Speaker Adaptation in Whisper for Chinese Disordered Speech Recognition
by: Jiang, Yicong, et al.
Published: (2024)
Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates
by: Xu, Haoning, et al.
Published: (2025)
Towards One-bit ASR: Extremely Low-bit Conformer Quantization Using Co-training and Stochastic Precision
by: Li, Zhaoqing, et al.
Published: (2025)
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
by: Cui, Mingyu, et al.
Published: (2024)
Exploring Cross-Utterance Speech Contexts for Conformer-Transducer Speech Recognition Systems
by: Cui, Mingyu, et al.
Published: (2025)
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
by: Kang, Jiawen, et al.
Published: (2024)
UNISON: A Unified Sound Generation and Editing Framework via Deep LLM Fusion
by: Li, Zhaoqing, et al.
Published: (2026)
Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction
by: Chen, Xueyuan, et al.
Published: (2024)
Enhancing Speaker-Independent Dysarthric Speech Severity Classification with DSSCNet and Cross-Corpus Adaptation
by: Roy, Arnab Kumar, et al.
Published: (2025)
Spectral-Aware Low-Rank Adaptation for Speaker Verification
by: Li, Zhe, et al.
Published: (2025)
Bayesian Learning for Deep Neural Network Adaptation
by: Xie, Xurong, et al.
Published: (2020)
Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation
by: Wang, Shiyao, et al.
Published: (2024)
Investigation of Deep Neural Network Acoustic Modelling Approaches for Low Resource Accented Mandarin Speech Recognition
by: Xie, Xurong, et al.
Published: (2022)
Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR
by: Li, Shaojun, et al.
Published: (2024)
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition
by: Kang, Jiawen, et al.
Published: (2024)
Speaker Contrastive Learning for Source Speaker Tracing
by: Wang, Qing, et al.
Published: (2024)
SCDNet: Self-supervised Learning Feature-based Speaker Change Detection
by: Li, Yue, et al.
Published: (2024)
Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR
by: Wang, Weiqing, et al.
Published: (2025)
Enhancing Target Speaker Extraction with Explicit Speaker Consistency Modeling
by: Wu, Shu, et al.
Published: (2025)
Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)
An Investigation on Speaker Augmentation for End-to-End Speaker Extraction
by: You, Zhenghai, et al.
Published: (2025)
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
by: Fu, Ruibo, et al.
Published: (2024)
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
by: Meng, Lingwei, et al.
Published: (2024)
A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)