| Main Authors: | Baoueb, Teysir; Bie, Xiaoyu; Janati, Hicham; Richard, Gael |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.15321 |
Similar Items
GLA-Grad++: An Improved Griffin-Lim Guided Diffusion Model for Speech Synthesis
by: Baoueb, Teysir, et al.
Published: (2025)
Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models
by: Baoueb, Teysir, et al.
Published: (2025)
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
by: Liu, Haocheng, et al.
Published: (2024)
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
by: Baoueb, Teysir, et al.
Published: (2024)
Diffusion Timbre Transfer Via Mutual Information Guided Inpainting
by: Lee, Ching Ho, et al.
Published: (2026)
Désentrelacement Fréquentiel Doux pour les Codecs Audio Neuronaux
by: Giniès, Benoît, et al.
Published: (2025)
Learning Source Disentanglement in Neural Audio Codec
by: Bie, Xiaoyu, et al.
Published: (2024)
Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Music with Acoustic-Informed Attention Calibration
by: Li, Haowen, et al.
Published: (2026)
Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer
by: Mancusi, Michele, et al.
Published: (2024)
DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification
by: Wang, Qing, et al.
Published: (2025)
Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
by: Kim, Tae-Woo, et al.
Published: (2022)
Listening to Multi-talker Conversations: Modular and End-to-end Perspectives
by: Raj, Desh
Published: (2024)
Timbre Difference Capturing in Anomalous Sound Detection
by: Nishida, Tomoya, et al.
Published: (2024)
Assessing the Alignment of Audio Representations with Timbre Similarity Ratings
by: Tian, Haokun, et al.
Published: (2025)
Timbre Perception, Representation, and its Neuroscientific Exploration: A Comprehensive Review
by: Zhang, Hong, et al.
Published: (2024)
DIFFRENT: A Diffusion Model for Recording Environment Transfer of Speech
by: Im, Jaekwon, et al.
Published: (2024)
QvTAD: Differential Relative Attribute Learning for Voice Timbre Attribute Detection
by: Wu, Zhiyu, et al.
Published: (2025)
Music Style Transfer with Time-Varying Inversion of Diffusion Models
by: Li, Sifei, et al.
Published: (2024)
End-to-end multi-channel speaker extraction and binaural speech synthesis
by: Chi, Cheng, et al.
Published: (2024)
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech
by: Lin, Guan-Ting, et al.
Published: (2024)
StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)
Stutter-Solver: End-to-end Multi-lingual Dysfluency Detection
by: Zhou, Xuanru, et al.
Published: (2024)
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
by: Kushwaha, Saksham Singh, et al.
Published: (2024)
Is Transfer Learning Necessary for Violin Transcription?
by: Peng, Yueh-Po, et al.
Published: (2025)
Transfer Learning with Pseudo Multi-Label Birdcall Classification for DS@GT BirdCLEF 2024
by: Miyaguchi, Anthony, et al.
Published: (2024)
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
by: Guan, Wenhao, et al.
Published: (2023)
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
by: Du, Zongyang, et al.
Published: (2024)
The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan
by: Sheng, Zhengyan, et al.
Published: (2025)
Diff-MST: Differentiable Mixing Style Transfer
by: Vanka, Soumya Sai, et al.
Published: (2024)
Transferable Adversarial Attacks on Audio Deepfake Detection
by: Farooq, Muhammad Umar, et al.
Published: (2025)
Diff-MSTC: A Mixing Style Transfer Prototype for Cubase
by: Vanka, Soumya Sai, et al.
Published: (2024)
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing
by: Li, Jingbei, et al.
Published: (2023)
Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription
by: Cwitkowitz, Frank, et al.
Published: (2023)
Zero-shot Cross-lingual Voice Transfer for TTS
by: Biadsy, Fadi, et al.
Published: (2024)
Relative Transfer Matrix Estimator using Covariance Subtraction
by: Manamperi, Wageesha N., et al.
Published: (2025)
Do Joint Language-Audio Embeddings Encode Perceptual Timbre Semantics?
by: Deng, Qixin, et al.
Published: (2025)
Accent-VITS: accent transfer for end-to-end TTS
by: Ma, Linhan, et al.
Published: (2023)
An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning
by: Guimarães, Heitor R., et al.
Published: (2024)
End-to-End Multi-Task Learning for Adjustable Joint Noise Reduction and Hearing Loss Compensation
by: Gonzalez, Philippe, et al.
Published: (2026)