| Main Authors: | Li, Chang, Chen, Zehua, Wang, Liyuan, Zhu, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.17609 |
Similar Items
PACE: Pretrained Audio Continual Learning
by: Li, Chang, et al.
Published: (2026)
FastWave: Optimized Diffusion Model for Audio Super-Resolution
by: Kuznetsov, Nikita, et al.
Published: (2026)
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
by: Chen, Zehua, et al.
Published: (2023)
GMS-CAVP: Improving Audio-Video Correspondence with Multi-Scale Contrastive and Generative Pretraining
by: Mo, Shentong, et al.
Published: (2026)
DiffGAP: A Lightweight Diffusion Module in Contrastive Space for Bridging Cross-Model Gap
by: Mo, Shentong, et al.
Published: (2025)
VoiceBridge: General Speech Restoration with One-step Latent Bridge Models
by: Zhang, Chi, et al.
Published: (2025)
Bridge-SR: Schrödinger Bridge for Efficient SR
by: Li, Chang, et al.
Published: (2025)
Learning Interpretable Features in Audio Latent Spaces via Sparse Autoencoders
by: Paek, Nathan, et al.
Published: (2025)
CyIN: Cyclic Informative Latent Space for Bridging Complete and Incomplete Multimodal Learning
by: Lin, Ronghao, et al.
Published: (2026)
A2SB: Audio-to-Audio Schrodinger Bridges
by: Kong, Zhifeng, et al.
Published: (2025)
Music2Latent: Consistency Autoencoders for Latent Audio Compression
by: Pasini, Marco, et al.
Published: (2024)
CodecSep: Prompt-Driven Universal Sound Separation on Neural Audio Codec Latents
by: Banerjee, Adhiraj, et al.
Published: (2025)
Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement
by: Lin, Meng-Ping, et al.
Published: (2025)
Low-Resource Guidance for Controllable Latent Audio Diffusion
by: Novack, Zachary, et al.
Published: (2026)
Exploring Token-Space Manipulation in Latent Audio Tokenizers
by: Paissan, Francesco, et al.
Published: (2026)
Detecting and Mitigating Insertion Hallucination in Video-to-Audio Generation
by: Chen, Liyang, et al.
Published: (2025)
Learning to Upsample and Upmix Audio in the Latent Domain
by: Bralios, Dimitrios, et al.
Published: (2025)
Fast Timing-Conditioned Latent Audio Diffusion
by: Evans, Zach, et al.
Published: (2024)
Bridging the Perception Gap: A Lightweight Coarse-to-Fine Architecture for Edge Audio Systems
by: Zhang, Hengfan, et al.
Published: (2026)
Heterogeneity-Aware Dataset Scheduling for Efficient Audio Large Language Model Training
by: Wu, Yanru, et al.
Published: (2026)
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
by: Wang, Junyou, et al.
Published: (2025)
Segmentwise Pruning in Audio-Language Models
by: Gibier, Marcel, et al.
Published: (2025)
Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning
by: Wu, Daiqing, et al.
Published: (2026)
Re-Bottleneck: Latent Re-Structuring for Neural Audio Autoencoders
by: Bralios, Dimitrios, et al.
Published: (2025)
High-Resolution Speech Restoration with Latent Diffusion Model
by: Dhyani, Tushar, et al.
Published: (2024)
Gaussian Flow Bridges for Audio Domain Transfer with Unpaired Data
by: Moliner, Eloi, et al.
Published: (2024)
How to Label Resynthesized Audio: The Dual Role of Neural Audio Codecs in Audio Deepfake Detection
by: Xiao, Yixuan, et al.
Published: (2026)
ADNAC: Audio Denoiser using Neural Audio Codec
by: Jimon, Daniel, et al.
Published: (2025)
ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model
by: Heydari, Mojtaba, et al.
Published: (2024)
Learning Audio-Visual Embeddings with Inferred Latent Interaction Graphs
by: Zeng, Donghuo, et al.
Published: (2026)
AudioCodecBench: A Comprehensive Benchmark for Audio Codec Evaluation
by: Wang, Lu, et al.
Published: (2025)
TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms
by: Sui, Yueyuan, et al.
Published: (2024)
SEE: Signal Embedding Energy for Quantifying Noise Interference in Large Audio Language Models
by: Zhang, Yuanhe, et al.
Published: (2026)
TADA! Tuning Audio Diffusion Models through Activation Steering
by: Staniszewski, Łukasz, et al.
Published: (2026)
LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation
by: Baker, Tom, et al.
Published: (2025)
Active Restoration of Lost Audio Signals Using Machine Learning and Latent Information
by: Cheddad, Zohra Adila, et al.
Published: (2021)
An Enhanced Audio Feature Tailored for Anomalous Sound Detection Based on Pre-trained Models
by: Zhong, Guirui, et al.
Published: (2025)
Unleashing the Power of Natural Audio Featuring Multiple Sound Sources
by: Cheng, Xize, et al.
Published: (2025)
Time-Varying Audio Effect Modeling by End-to-End Adversarial Training
by: Bourdin, Yann, et al.
Published: (2025)
Latent Granular Resynthesis using Neural Audio Codecs
by: Tokui, Nao, et al.
Published: (2025)