| Main Authors: | Niu, Xinlei, Zhang, Jing, Walder, Christian, Martin, Charles Patrick |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.15338 |
Similar Items
SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model
by: Niu, Xinlei, et al.
Published: (2024)
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
by: Niu, Xinlei, et al.
Published: (2024)
Efficient Sound Field Reconstruction with Conditional Invertible Neural Networks
by: Karakonstantis, Xenofon, et al.
Published: (2024)
Diffuse Sound Field Synthesis
by: Zotter, Franz, et al.
Published: (2024)
Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech
by: Niu, Xinlei, et al.
Published: (2025)
AudioSpa: Spatializing Sound Events with Text
by: Feng, Linfeng, et al.
Published: (2025)
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
by: Saito, Koichi, et al.
Published: (2024)
Learning Magnitude Distribution of Sound Fields via Conditioned Autoencoder
by: Koyama, Shoichi, et al.
Published: (2025)
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models
by: Yang, Dongchao, et al.
Published: (2024)
Leveraging Sound Source Trajectories for Universal Sound Separation
by: Wu, Donghang, et al.
Published: (2024)
Sound Zone Control Robust To Sound Speed Change
by: Bhattacharjee, Sankha Subhra, et al.
Published: (2024)
Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)
Evaluating Sound Similarity Metrics for Differentiable, Iterative Sound-Matching
by: Salimi, Amir, et al.
Published: (2025)
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction
by: Saijo, Kohei, et al.
Published: (2024)
Exploring Text-Queried Sound Event Detection with Audio Source Separation
by: Yin, Han, et al.
Published: (2024)
SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model
by: Hernandez-Olivan, Carlos, et al.
Published: (2024)
Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning
by: Zhang, Jisi, et al.
Published: (2025)
SteerMusic: Enhanced Musical Consistency for Zero-shot Text-guided and Personalized Music Editing
by: Niu, Xinlei, et al.
Published: (2025)
Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection
by: Guan, Yadong, et al.
Published: (2024)
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)
Improvements of Discriminative Feature Space Training for Anomalous Sound Detection in Unlabeled Conditions
by: Fujimura, Takuya, et al.
Published: (2024)
Stream-based Active Learning for Anomalous Sound Detection in Machine Condition Monitoring
by: Ho, Tuan Vu, et al.
Published: (2024)
ASD-Diffusion: Anomalous Sound Detection with Diffusion Models
by: Zhang, Fengrun, et al.
Published: (2024)
Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
by: Yin, Han, et al.
Published: (2024)
MambaFoley: Foley Sound Generation using Selective State-Space Models
by: Colombo, Marco Furio, et al.
Published: (2024)
Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance
by: Peter, Silvan David, et al.
Published: (2023)
BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification
by: Kim, June-Woo, et al.
Published: (2024)
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds
by: Xu, Xuenan, et al.
Published: (2024)
Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation
by: Chen, Yuanjian, et al.
Published: (2025)
DiffSound: Differentiable Modal Sound Rendering and Inverse Rendering for Diverse Inference Tasks
by: Jin, Xutong, et al.
Published: (2024)
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
by: Wang, Helin, et al.
Published: (2024)
AudioLCM: Text-to-Audio Generation with Latent Consistency Models
by: Liu, Huadai, et al.
Published: (2024)
Retaining Mixture Representations for Domain Generalized Anomalous Sound Detection
by: Saengthong, Phurich, et al.
Published: (2025)
A Steered Response Power Method for Sound Source Localization With Generic Acoustic Models
by: Müller, Kaspar, et al.
Published: (2025)
Fractional Fourier Sound Synthesis
by: Gutiérrez, Esteban, et al.
Published: (2025)
Sound Event Bounding Boxes
by: Ebbers, Janek, et al.
Published: (2024)
Trainingless Adaptation of Pretrained Models for Environmental Sound Classification
by: Tonami, Noriyuki, et al.
Published: (2024)
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models
by: Wu, Haibin, et al.
Published: (2024)
Abstract Sound Fusion with Unconditional Inversion Models
by: Liu, Jing, et al.
Published: (2025)
Intelligent Text-Conditioned Music Generation
by: Xie, Zhouyao, et al.
Published: (2024)