:: Library Catalog

Bibliographic Details
Main Authors:	Baoueb, Teysir; Bie, Xiaoyu; Janati, Hicham; Richard, Gael
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2409.15321

Similar Items

GLA-Grad++: An Improved Griffin-Lim Guided Diffusion Model for Speech Synthesis
by: Baoueb, Teysir, et al.
Published: (2025)

Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models
by: Baoueb, Teysir, et al.
Published: (2025)

GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
by: Liu, Haocheng, et al.
Published: (2024)

SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
by: Baoueb, Teysir, et al.
Published: (2024)

Diffusion Timbre Transfer Via Mutual Information Guided Inpainting
by: Lee, Ching Ho, et al.
Published: (2026)

Désentrelacement Fréquentiel Doux pour les Codecs Audio Neuronaux
by: Giniès, Benoît, et al.
Published: (2025)

Learning Source Disentanglement in Neural Audio Codec
by: Bie, Xiaoyu, et al.
Published: (2024)

Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Music with Acoustic-Informed Attention Calibration
by: Li, Haowen, et al.
Published: (2026)

Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer
by: Mancusi, Michele, et al.
Published: (2024)

DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification
by: Wang, Qing, et al.
Published: (2025)

Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
by: Kim, Tae-Woo, et al.
Published: (2022)

Listening to Multi-talker Conversations: Modular and End-to-end Perspectives
by: Raj, Desh
Published: (2024)

Timbre Difference Capturing in Anomalous Sound Detection
by: Nishida, Tomoya, et al.
Published: (2024)

Assessing the Alignment of Audio Representations with Timbre Similarity Ratings
by: Tian, Haokun, et al.
Published: (2025)

Timbre Perception, Representation, and its Neuroscientific Exploration: A Comprehensive Review
by: Zhang, Hong, et al.
Published: (2024)

DIFFRENT: A Diffusion Model for Recording Environment Transfer of Speech
by: Im, Jaekwon, et al.
Published: (2024)

QvTAD: Differential Relative Attribute Learning for Voice Timbre Attribute Detection
by: Wu, Zhiyu, et al.
Published: (2025)

Music Style Transfer with Time-Varying Inversion of Diffusion Models
by: Li, Sifei, et al.
Published: (2024)

End-to-end multi-channel speaker extraction and binaural speech synthesis
by: Chi, Cheng, et al.
Published: (2024)

Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech
by: Lin, Guan-Ting, et al.
Published: (2024)

StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)

Stutter-Solver: End-to-end Multi-lingual Dysfluency Detection
by: Zhou, Xuanru, et al.
Published: (2024)

End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)

Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
by: Kushwaha, Saksham Singh, et al.
Published: (2024)

Is Transfer Learning Necessary for Violin Transcription?
by: Peng, Yueh-Po, et al.
Published: (2025)

Transfer Learning with Pseudo Multi-Label Birdcall Classification for DS@GT BirdCLEF 2024
by: Miyaguchi, Anthony, et al.
Published: (2024)

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
by: Guan, Wenhao, et al.
Published: (2023)

Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
by: Du, Zongyang, et al.
Published: (2024)

The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan
by: Sheng, Zhengyan, et al.
Published: (2025)

Diff-MST: Differentiable Mixing Style Transfer
by: Vanka, Soumya Sai, et al.
Published: (2024)

Transferable Adversarial Attacks on Audio Deepfake Detection
by: Farooq, Muhammad Umar, et al.
Published: (2025)

Diff-MSTC: A Mixing Style Transfer Prototype for Cubase
by: Vanka, Soumya Sai, et al.
Published: (2024)

Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing
by: Li, Jingbei, et al.
Published: (2023)

Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription
by: Cwitkowitz, Frank, et al.
Published: (2023)

Zero-shot Cross-lingual Voice Transfer for TTS
by: Biadsy, Fadi, et al.
Published: (2024)

Relative Transfer Matrix Estimator using Covariance Subtraction
by: Manamperi, Wageesha N., et al.
Published: (2025)

Do Joint Language-Audio Embeddings Encode Perceptual Timbre Semantics?
by: Deng, Qixin, et al.
Published: (2025)

Accent-VITS: accent transfer for end-to-end TTS
by: Ma, Linhan, et al.
Published: (2023)

An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning
by: Guimarães, Heitor R., et al.
Published: (2024)

End-to-End Multi-Task Learning for Adjustable Joint Noise Reduction and Hearing Loss Compensation
by: Gonzalez, Philippe, et al.
Published: (2026)