| Main Authors: | Lerbourg, Louis, Peyret, Paul, Linossier, Juliette, Malfante, Marielle |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.03412 |
Similar Items
NBM: an Open Dataset for the Acoustic Monitoring of Nocturnal Migratory Birds in Europe
by: Airale, Louis, et al.
Published: (2024)
Graph Embedding with Mel-spectrograms for Underwater Acoustic Target Recognition
by: Feng, Sheng, et al.
Published: (2025)
Towards Explicit Acoustic Evidence Perception in Audio LLMs for Speech Deepfake Detection
by: Guo, Xiaoxuan, et al.
Published: (2026)
No Free Lunch from Audio Pretraining in Bioacoustics: A Benchmark Study of Embeddings
by: Chen, Chenggang, et al.
Published: (2025)
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
by: Wu, Tung-Yu, et al.
Published: (2024)
AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks
by: Liang, Yun, et al.
Published: (2024)
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
by: Kim, Jaeyeon, et al.
Published: (2024)
MAEB: Massive Audio Embedding Benchmark
by: Assadi, Adnan El, et al.
Published: (2026)
Embedding Alignment in Code Generation for Audio
by: Kouteili, Sam, et al.
Published: (2025)
DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models
by: Xiong, Jiaqi, et al.
Published: (2026)
Are Audio-Language Models Listening? Audio-Specialist Heads for Adaptive Audio Steering
by: Glazer, Neta, et al.
Published: (2026)
Quantum-Inspired Genetic Algorithm for Robust Source Separation in Smart City Acoustics
by: Quan, Minh K., et al.
Published: (2025)
Eureka-Audio: Triggering Audio Intelligence in Compact Language Models
by: Zhang, Dan, et al.
Published: (2026)
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
by: Wang, Junyou, et al.
Published: (2025)
The Sonar Moment: Benchmarking Audio-Language Models in Audio Geo-Localization
by: Zhang, Ruixing, et al.
Published: (2026)
LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation
by: Zhang, Zhisheng, et al.
Published: (2026)
MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
by: Li, Xiquan, et al.
Published: (2025)
AudioMotionBench: Evaluating Auditory Motion Perception in Audio LLMs
by: Sun, Zhe, et al.
Published: (2025)
Evaluating Hallucinations in Audio-Visual Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions
by: Park, Hansol, et al.
Published: (2025)
The Computation of Generalized Embeddings for Underwater Acoustic Target Recognition using Contrastive Learning
by: Hummel, Hilde I., et al.
Published: (2025)
Stable Audio 3
by: Evans, Zach, et al.
Published: (2026)
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
by: Mahdi, Hamza, et al.
Published: (2024)
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders
by: Aparin, Georgii, et al.
Published: (2026)
Audio-Maestro: Enhancing Large Audio-Language Models with Tool-Augmented Reasoning
by: Lee, Kuan-Yi, et al.
Published: (2025)
Do Joint Language-Audio Embeddings Encode Perceptual Timbre Semantics?
by: Deng, Qixin, et al.
Published: (2025)
Towards Leveraging Contrastively Pretrained Neural Audio Embeddings for Recommender Tasks
by: Grötschla, Florian, et al.
Published: (2024)
AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech
by: Qiu, Jielin, et al.
Published: (2026)
AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models
by: Kang, Mintong, et al.
Published: (2026)
HalluAudio: A Comprehensive Benchmark for Hallucination Detection in Large Audio-Language Models
by: Zhao, Feiyu, et al.
Published: (2026)
Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation
by: Li, Jia, et al.
Published: (2025)
Explaining Deep Learning Embeddings for Speech Emotion Recognition by Predicting Interpretable Acoustic Features
by: Dixit, Satvik, et al.
Published: (2024)
MOSS-Audio Technical Report
by: Yang, Chen, et al.
Published: (2026)
CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting
by: Jin, Sichen, et al.
Published: (2024)
Interpretable All-Type Audio Deepfake Detection with Audio LLMs via Frequency-Time Reinforcement Learning
by: Xie, Yuankun, et al.
Published: (2026)
Codec-Robust Attacks on Audio LLMs
by: Roh, Jaechul, et al.
Published: (2026)
Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model
by: Barnett, Julia, et al.
Published: (2024)
PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation
by: Xie, Tianxin, et al.
Published: (2025)
WoW-Bench: Evaluating Fine-Grained Acoustic Perception in Audio-Language Models via Marine Mammal Vocalizations
by: Kim, Jaeyeon, et al.
Published: (2025)
VidAudio-Bench: Benchmarking V2A and VT2A Generation across Four Audio Categories
by: Zhang, Qian, et al.
Published: (2026)
Towards Fine-grained Temporal Perception: Post-Training Large Audio-Language Models with Audio-Side Time Prompt
by: Shi, Yanfeng, et al.
Published: (2026)