| Main Authors: | Ren, Zhao, Scheck, Kevin, Hou, Qinhan, van Gogh, Stefano, Wand, Michael, Schultz, Tanja |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.08021 |
Similar Items
Machine Unlearning in Speech Emotion Recognition via Forget Set Alone
by: Ren, Zhao, et al.
Published: (2025)
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)
Deep Speech Synthesis from Multimodal Articulatory Representations
by: Wu, Peter, et al.
Published: (2024)
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition
by: Tan, Chao, et al.
Published: (2024)
Speech as a Biomarker for Disease Detection
by: Botelho, Catarina, et al.
Published: (2024)
Multi-modal Speech Enhancement with Limited Electromyography Channels
by: Feng, Fuyuan, et al.
Published: (2025)
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models
by: Wu, Weihao, et al.
Published: (2025)
DiffDSR: Dysarthric Speech Reconstruction Using Latent Diffusion Model
by: Chen, Xueyuan, et al.
Published: (2025)
Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation
by: Chang, Yi, et al.
Published: (2024)
CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion
by: Li, Yuke, et al.
Published: (2024)
DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
by: Benita, Roi, et al.
Published: (2023)
Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis
by: Hu, Yifan, et al.
Published: (2025)
Affect Decoding in Phonated and Silent Speech Production from Surface EMG
by: Pistrosch, Simon, et al.
Published: (2026)
Voice-ENHANCE: Speech Restoration using a Diffusion-based Voice Conversion Framework
by: Byun, Kyungguen, et al.
Published: (2025)
SOVA-Bench: Benchmarking the Speech Conversation Ability for LLM-based Voice Assistant
by: Hou, Yixuan, et al.
Published: (2025)
Conditional Latent Diffusion-Based Speech Enhancement Via Dual Context Learning
by: Zhao, Shengkui, et al.
Published: (2025)
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
by: Li, Ruiqi, et al.
Published: (2024)
An Explainable Probabilistic Attribute Embedding Approach for Spoofed Speech Characterization
by: Chhibber, Manasi, et al.
Published: (2024)
Freeze and Learn: Continual Learning with Selective Freezing for Speech Deepfake Detection
by: Salvi, Davide, et al.
Published: (2024)
DiffAU: Diffusion-Based Ambisonics Upscaling
by: Milstein, Amit, et al.
Published: (2025)
Noise-aware Speech Enhancement using Diffusion Probabilistic Model
by: Hu, Yuchen, et al.
Published: (2023)
DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification
by: Wang, Qing, et al.
Published: (2025)
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
by: Kushwaha, Saksham Singh, et al.
Published: (2024)
Generative Expressive Conversational Speech Synthesis
by: Liu, Rui, et al.
Published: (2024)
Variational Autoencoder for Personalized Pathological Speech Enhancement
by: Hou, Mingchi, et al.
Published: (2025)
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data
by: Prabhu, Navin Raj, et al.
Published: (2023)
REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion
by: Biyani, Ishan D., et al.
Published: (2025)
RAVE for Speech: Efficient Voice Conversion at High Sampling Rates
by: Bargum, Anders R., et al.
Published: (2024)
Source Verification for Speech Deepfakes
by: Negroni, Viola, et al.
Published: (2025)
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
by: Liu, Huadai, et al.
Published: (2023)
The TEA-ASLP System for Multilingual Conversational Speech Recognition and Speech Diarization in MLC-SLM 2025 Challenge
by: Xue, Hongfei, et al.
Published: (2025)
Objective and Subjective Evaluation of Diffusion-Based Speech Enhancement for Dysarthric Speech
by: de Groot, Dimme, et al.
Published: (2025)
Absorbing Discrete Diffusion for Speech Enhancement
by: Gonzalez, Philippe
Published: (2026)
LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models
by: Richter, Julius, et al.
Published: (2025)
VC-ENHANCE: Speech Restoration with Integrated Noise Suppression and Voice Conversion
by: Byun, Kyungguen, et al.
Published: (2024)
Conformer-based Ultrasound-to-Speech Conversion
by: Ibrahimov, Ibrahim, et al.
Published: (2025)
Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy
by: Xue, Ke, et al.
Published: (2026)
EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning
by: Liang, Ziqi, et al.
Published: (2024)
Multitask Learning for Grapheme-to-Phoneme Conversion of Anglicisms in German Speech Recognition
by: Pritzen, Julia, et al.
Published: (2021)
dLLM-ASR: A Faster Diffusion LLM-based Framework for Speech Recognition
by: Tian, Wenjie, et al.
Published: (2026)