| Main Authors: | Aloufi, Ranya, Gupta, Srishti, Shaw, Soumya, Biggio, Battista, Schönherr, Lea |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.13262 |
Similar Items
Buffer-free Class-Incremental Learning with Out-of-Distribution Detection
by: Gupta, Srishti, et al.
Published: (2025)
Do LLM Decoders Listen Fairly? Benchmarking How Language Model Priors Shape Bias in Speech Recognition
by: Ginjala, Srishti, et al.
Published: (2026)
AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models
by: Kang, Mintong, et al.
Published: (2026)
Are Audio-Language Models Listening? Audio-Specialist Heads for Adaptive Audio Steering
by: Glazer, Neta, et al.
Published: (2026)
Eureka-Audio: Triggering Audio Intelligence in Compact Language Models
by: Zhang, Dan, et al.
Published: (2026)
ClaritySpeech: Dementia Obfuscation in Speech
by: Woszczyk, Dominika, et al.
Published: (2025)
Prosody-Driven Privacy-Preserving Dementia Detection
by: Woszczyk, Dominika, et al.
Published: (2024)
Gender Fairness in Audio Deepfake Detection: Performance and Disparity Analysis
by: Fursule, Aishwarya, et al.
Published: (2026)
The Sonar Moment: Benchmarking Audio-Language Models in Audio Geo-Localization
by: Zhang, Ruixing, et al.
Published: (2026)
Audio-Maestro: Enhancing Large Audio-Language Models with Tool-Augmented Reasoning
by: Lee, Kuan-Yi, et al.
Published: (2025)
HalluAudio: A Comprehensive Benchmark for Hallucination Detection in Large Audio-Language Models
by: Zhao, Feiyu, et al.
Published: (2026)
Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation
by: Feng, Bo-Han, et al.
Published: (2026)
AudioMotionBench: Evaluating Auditory Motion Perception in Audio LLMs
by: Sun, Zhe, et al.
Published: (2025)
Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Audio-Language Models
by: Yang, Wanqi, et al.
Published: (2024)
PitchBench: Measuring Pitch Hearing in Audio-Language Models
by: Dujardin, Milan Liessens, et al.
Published: (2026)
AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech
by: Qiu, Jielin, et al.
Published: (2026)
Towards Fine-grained Temporal Perception: Post-Training Large Audio-Language Models with Audio-Side Time Prompt
by: Shi, Yanfeng, et al.
Published: (2026)
The World is Not Mono: Enabling Spatial Understanding in Large Audio-Language Models
by: You, Yuhuan, et al.
Published: (2026)
Do Audio-Visual Large Language Models Really See and Hear?
by: Selvakumar, Ramaneswaran, et al.
Published: (2026)
OWL: Geometry-Aware Spatial Reasoning for Audio Large Language Models
by: Biswas, Subrata, et al.
Published: (2025)
CALM: Class-Conditional Sparse Attention Vectors for Large Audio-Language Models
by: Mehta, Videet, et al.
Published: (2026)
Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding
by: Wang, Tsai-Ning, et al.
Published: (2025)
DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models
by: Xiong, Jiaqi, et al.
Published: (2026)
Temporal Contrastive Decoding: A Training-Free Method for Large Audio-Language Models
by: Li, Yanda, et al.
Published: (2026)
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders
by: Aparin, Georgii, et al.
Published: (2026)
VocalParse: Towards Unified and Scalable Singing Voice Transcription with Large Audio Language Models
by: Chen, Yukun, et al.
Published: (2026)
EMO-TTA: Improving Test-Time Adaptation of Audio-Language Models for Speech Emotion Recognition
by: Shi, Jiacheng, et al.
Published: (2025)
AT-ADD: All-Type Audio Deepfake Detection Challenge Evaluation Plan
by: Xie, Yuankun, et al.
Published: (2026)
Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations
by: Feng, Bo-Han, et al.
Published: (2025)
AudioCodecBench: A Comprehensive Benchmark for Audio Codec Evaluation
by: Wang, Lu, et al.
Published: (2025)
All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation
by: Foo, Leonardo Haw-Yang, et al.
Published: (2026)
UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models
by: Shi, Qundong, et al.
Published: (2026)
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models
by: Song, Zirui, et al.
Published: (2025)
TempoSyncDiff: Distilled Temporally-Consistent Diffusion for Low-Latency Audio-Driven Talking Head Generation
by: Mazumdar, Soumya, et al.
Published: (2026)
Evaluation of Deep Audio Representations for Hearables
by: Gröger, Fabian, et al.
Published: (2025)
Guiding Audio Editing with Audio Language Model
by: Lan, Zitong, et al.
Published: (2025)
Evaluating Semantic Fragility in Text-to-Audio Generation Systems Under Controlled Prompt Perturbations
by: Wu, Jiahui
Published: (2026)
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
by: Wang, Junyou, et al.
Published: (2025)
VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation
by: Kim, Yubin, et al.
Published: (2025)
Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack
by: Ziv, Roee, et al.
Published: (2025)