| Main Authors: | Sun, Haoran; Fourer, Dominique; Maaref, Hichem |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.04337 |
Similar Items
LAV: Audio-Driven Dynamic Visual Generation with Neural Compression and StyleGAN2
by: Jung, Jongmin, et al.
Published: (2025)
Compressing Quaternion Convolutional Neural Networks for Audio Classification
by: Singh, Arshdeep, et al.
Published: (2025)
Apollo: Band-sequence Modeling for High-Quality Audio Restoration
by: Li, Kai, et al.
Published: (2024)
Expressive Range Characterization of Open Text-to-Audio Models
by: Morse, Jonathan, et al.
Published: (2025)
OpenSep: Leveraging Large Language Models with Textual Inversion for Open World Audio Separation
by: Mahmud, Tanvir, et al.
Published: (2024)
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
by: Kim, Jaeyeon, et al.
Published: (2024)
Unify Variables in Neural Scaling Laws for General Audio Representations via Embedding Effective Rank
by: Deng, Xuyao, et al.
Published: (2025)
SpectroStream: A Versatile Neural Codec for General Audio
by: Li, Yunpeng, et al.
Published: (2025)
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables
by: Attia, Ahmed Adel, et al.
Published: (2023)
How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?
by: Liu, Tianchi, et al.
Published: (2024)
Towards Leveraging Contrastively Pretrained Neural Audio Embeddings for Recommender Tasks
by: Grötschla, Florian, et al.
Published: (2024)
Audio Explanation Synthesis with Generative Foundation Models
by: Akman, Alican, et al.
Published: (2024)
Audio Mamba: Pretrained Audio State Space Model For Audio Tagging
by: Lin, Jiaju, et al.
Published: (2024)
Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
by: Haque, Kazi Nazmul, et al.
Published: (2024)
LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
by: Singh, Shubhr, et al.
Published: (2025)
Enhancing Partially Spoofed Audio Localization with Boundary-aware Attention Mechanism
by: Zhong, Jiafeng, et al.
Published: (2024)
FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion
by: Chen, Shunian, et al.
Published: (2025)
Neural Style Transfer for Audio Spectograms
by: Verma, Prateek, et al.
Published: (2018)
Audio Atlas: Visualizing and Exploring Audio Datasets
by: Lanzendörfer, Luca A., et al.
Published: (2024)
SUBARU: A Practical Approach to Power Saving in Hearables Using SUB-Nyquist Audio Resolution Upsampling
by: Tamiti, Tarikul Islam, et al.
Published: (2025)
Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing
by: Shim, Hye-jin, et al.
Published: (2024)
4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification
by: Berg, Andrew P., et al.
Published: (2025)
Representation-Regularized Convolutional Audio Transformer for Audio Understanding
by: Han, Bing, et al.
Published: (2026)
Audio Spatially-Guided Fusion for Audio-Visual Navigation
by: Zhou, Xinyu, et al.
Published: (2026)
Audio-Conditioned Diffusion LLMs for ASR and Deliberation Processing
by: Wang, Mengqi, et al.
Published: (2025)
Abstract Sound Fusion with Unconditional Inversion Models
by: Liu, Jing, et al.
Published: (2025)
Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks
by: Atif, Youness
Published: (2025)
UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models
by: Shi, Qundong, et al.
Published: (2026)
Enhancing Retrieval-Augmented Audio Captioning with Generation-Assisted Multimodal Querying and Progressive Learning
by: Changin, Choi, et al.
Published: (2024)
CMDAR: A Chinese Multi-scene Dynamic Audio Reasoning Benchmark with Diverse Challenges
by: Li, Hui, et al.
Published: (2025)
AudioTurbo: Fast Text-to-Audio Generation with Rectified Diffusion
by: Zhao, Junqi, et al.
Published: (2025)
DreamAudio: Customized Text-to-Audio Generation with Diffusion Models
by: Yuan, Yi, et al.
Published: (2025)
Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation
by: Xiong, Chenxu, et al.
Published: (2024)
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
by: Erol, Mehmet Hamza, et al.
Published: (2024)
Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
by: Yadav, Sarthak, et al.
Published: (2024)
AudioScene: Integrating Object-Event Audio into 3D Scenes
by: Yuan, Shuaihang, et al.
Published: (2025)
Efficient Autoregressive Audio Modeling via Next-Scale Prediction
by: Qiu, Kai, et al.
Published: (2024)
Stable Audio Open
by: Evans, Zach, et al.
Published: (2024)
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio
by: Xie, Yuankun, et al.
Published: (2024)
HH-Codec: High Compression High-fidelity Discrete Neural Codec for Spoken Language Modeling
by: Xue, Rongkun, et al.
Published: (2025)