:: Library Catalog

Bibliographic Details
Main Authors:	Niu, Xinlei; Zhang, Jing; Walder, Christian; Martin, Charles Patrick
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2405.15338

Similar Items

SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model
by: Niu, Xinlei, et al.
Published: (2024)

HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
by: Niu, Xinlei, et al.
Published: (2024)

Efficient Sound Field Reconstruction with Conditional Invertible Neural Networks
by: Karakonstantis, Xenofon, et al.
Published: (2024)

Diffuse Sound Field Synthesis
by: Zotter, Franz, et al.
Published: (2024)

Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech
by: Niu, Xinlei, et al.
Published: (2025)

AudioSpa: Spatializing Sound Events with Text
by: Feng, Linfeng, et al.
Published: (2025)

SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
by: Saito, Koichi, et al.
Published: (2024)

Learning Magnitude Distribution of Sound Fields via Conditioned Autoencoder
by: Koyama, Shoichi, et al.
Published: (2025)

SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models
by: Yang, Dongchao, et al.
Published: (2024)

Leveraging Sound Source Trajectories for Universal Sound Separation
by: Wu, Donghang, et al.
Published: (2024)

Sound Zone Control Robust To Sound Speed Change
by: Bhattacharjee, Sankha Subhra, et al.
Published: (2024)

Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)

Evaluating Sound Similarity Metrics for Differentiable, Iterative Sound-Matching
by: Salimi, Amir, et al.
Published: (2025)

Leveraging Audio-Only Data for Text-Queried Target Sound Extraction
by: Saijo, Kohei, et al.
Published: (2024)

Exploring Text-Queried Sound Event Detection with Audio Source Separation
by: Yin, Han, et al.
Published: (2024)

SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model
by: Hernandez-Olivan, Carlos, et al.
Published: (2024)

Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning
by: Zhang, Jisi, et al.
Published: (2025)

SteerMusic: Enhanced Musical Consistency for Zero-shot Text-guided and Personalized Music Editing
by: Niu, Xinlei, et al.
Published: (2025)

Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection
by: Guan, Yadong, et al.
Published: (2024)

Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)

Improvements of Discriminative Feature Space Training for Anomalous Sound Detection in Unlabeled Conditions
by: Fujimura, Takuya, et al.
Published: (2024)

Stream-based Active Learning for Anomalous Sound Detection in Machine Condition Monitoring
by: Ho, Tuan Vu, et al.
Published: (2024)

ASD-Diffusion: Anomalous Sound Detection with Diffusion Models
by: Zhang, Fengrun, et al.
Published: (2024)

Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
by: Yin, Han, et al.
Published: (2024)

MambaFoley: Foley Sound Generation using Selective State-Space Models
by: Colombo, Marco Furio, et al.
Published: (2024)

Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance
by: Peter, Silvan David, et al.
Published: (2023)

BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification
by: Kim, June-Woo, et al.
Published: (2024)

A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds
by: Xu, Xuenan, et al.
Published: (2024)

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation
by: Chen, Yuanjian, et al.
Published: (2025)

DiffSound: Differentiable Modal Sound Rendering and Inverse Rendering for Diverse Inference Tasks
by: Jin, Xutong, et al.
Published: (2024)

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
by: Wang, Helin, et al.
Published: (2024)

AudioLCM: Text-to-Audio Generation with Latent Consistency Models
by: Liu, Huadai, et al.
Published: (2024)

Retaining Mixture Representations for Domain Generalized Anomalous Sound Detection
by: Saengthong, Phurich, et al.
Published: (2025)

A Steered Response Power Method for Sound Source Localization With Generic Acoustic Models
by: Müller, Kaspar, et al.
Published: (2025)

Fractional Fourier Sound Synthesis
by: Gutiérrez, Esteban, et al.
Published: (2025)

Sound Event Bounding Boxes
by: Ebbers, Janek, et al.
Published: (2024)

Trainingless Adaptation of Pretrained Models for Environmental Sound Classification
by: Tonami, Noriyuki, et al.
Published: (2024)

Codec-SUPERB: An In-Depth Analysis of Sound Codec Models
by: Wu, Haibin, et al.
Published: (2024)

Abstract Sound Fusion with Unconditional Inversion Models
by: Liu, Jing, et al.
Published: (2025)

Intelligent Text-Conditioned Music Generation
by: Xie, Zhouyao, et al.
Published: (2024)