Saved in:
| Main Authors: | Bertolino, Gaia A., Zhang, Yuwei, Xia, Tong, Talia, Domenico, Mascolo, Cecilia |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.06542 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
RA-QA: A Benchmarking System for Respiratory Audio Question Answering Under Real-World Heterogeneity
by: Bertolino, Gaia A., et al.
Published: (2026)
by: Bertolino, Gaia A., et al.
Published: (2026)
RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
by: Zhang, Yuwei, et al.
Published: (2024)
by: Zhang, Yuwei, et al.
Published: (2024)
Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
by: Zhang, Yuwei, et al.
Published: (2024)
by: Zhang, Yuwei, et al.
Published: (2024)
Spatial Audio Question Answering and Reasoning on Dynamic Source Movements
by: Sridhar, Arvind Krishna, et al.
Published: (2026)
by: Sridhar, Arvind Krishna, et al.
Published: (2026)
ORCA: Open-ended Response Correctness Assessment for Audio Question Answering
by: Sedláček, Šimon, et al.
Published: (2025)
by: Sedláček, Šimon, et al.
Published: (2025)
SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering
by: Yang, Zhe, et al.
Published: (2024)
by: Yang, Zhe, et al.
Published: (2024)
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering
by: Li, Gang, et al.
Published: (2025)
by: Li, Gang, et al.
Published: (2025)
AQAScore: Evaluating Semantic Alignment in Text-to-Audio Generation via Audio Question Answering
by: Kuan, Chun-Yi, et al.
Published: (2026)
by: Kuan, Chun-Yi, et al.
Published: (2026)
ATIR: Towards Audio-Text Interleaved Contextual Retrieval
by: Zhao, Tong, et al.
Published: (2026)
by: Zhao, Tong, et al.
Published: (2026)
Multi-Domain Audio Question Answering Benchmark Toward Acoustic Content Reasoning
by: Yang, Chao-Han Huck, et al.
Published: (2025)
by: Yang, Chao-Han Huck, et al.
Published: (2025)
Codec-Robust Attacks on Audio LLMs
by: Roh, Jaechul, et al.
Published: (2026)
by: Roh, Jaechul, et al.
Published: (2026)
Latent-Mark: An Audio Watermark Robust to Neural Resynthesis
by: Chen, Yen-Shan, et al.
Published: (2026)
by: Chen, Yen-Shan, et al.
Published: (2026)
UniWhisper: Efficient Continual Multi-task Training for Robust Universal Audio Representation
by: Chen, Yuxuan, et al.
Published: (2026)
by: Chen, Yuxuan, et al.
Published: (2026)
AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering
by: Kuan, Chun-Yi, et al.
Published: (2026)
by: Kuan, Chun-Yi, et al.
Published: (2026)
Robust Neural Audio Fingerprinting using Music Foundation Models
by: Singh, Shubhr, et al.
Published: (2025)
by: Singh, Shubhr, et al.
Published: (2025)
Rebellion: Noise-Robust Reasoning Training for Audio Reasoning Models
by: Huang, Tiansheng, et al.
Published: (2025)
by: Huang, Tiansheng, et al.
Published: (2025)
Explainable Multi-Modal Deep Learning for Automatic Detection of Lung Diseases from Respiratory Audio Signals
by: Saky, S M Asiful Islam, et al.
Published: (2025)
by: Saky, S M Asiful Islam, et al.
Published: (2025)
Measuring the Robustness of Audio Deepfake Detectors
by: Li, Xiang, et al.
Published: (2025)
by: Li, Xiang, et al.
Published: (2025)
The Sonar Moment: Benchmarking Audio-Language Models in Audio Geo-Localization
by: Zhang, Ruixing, et al.
Published: (2026)
by: Zhang, Ruixing, et al.
Published: (2026)
Eureka-Audio: Triggering Audio Intelligence in Compact Language Models
by: Zhang, Dan, et al.
Published: (2026)
by: Zhang, Dan, et al.
Published: (2026)
Are Audio-Language Models Listening? Audio-Specialist Heads for Adaptive Audio Steering
by: Glazer, Neta, et al.
Published: (2026)
by: Glazer, Neta, et al.
Published: (2026)
Patient-Level Multimodal Question Answering from Multi-Site Auscultation Recordings
by: Wu, Fan, et al.
Published: (2026)
by: Wu, Fan, et al.
Published: (2026)
AuTAgent: A Reinforcement Learning Framework for Tool-Augmented Audio Reasoning
by: Tong, Siqian, et al.
Published: (2026)
by: Tong, Siqian, et al.
Published: (2026)
HalluAudio: A Comprehensive Benchmark for Hallucination Detection in Large Audio-Language Models
by: Zhao, Feiyu, et al.
Published: (2026)
by: Zhao, Feiyu, et al.
Published: (2026)
BanglaFake: Constructing and Evaluating a Specialized Bengali Deepfake Audio Dataset
by: Fahad, Istiaq Ahmed, et al.
Published: (2025)
by: Fahad, Istiaq Ahmed, et al.
Published: (2025)
EchoDistill:Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs
by: Lin, Liang, et al.
Published: (2026)
by: Lin, Liang, et al.
Published: (2026)
HierCon: Hierarchical Contrastive Attention for Audio Deepfake Detection
by: Liang, Zhili Nicholas, et al.
Published: (2026)
by: Liang, Zhili Nicholas, et al.
Published: (2026)
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
by: Wang, Junyou, et al.
Published: (2025)
by: Wang, Junyou, et al.
Published: (2025)
AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech
by: Qiu, Jielin, et al.
Published: (2026)
by: Qiu, Jielin, et al.
Published: (2026)
VidAudio-Bench: Benchmarking V2A and VT2A Generation across Four Audio Categories
by: Zhang, Qian, et al.
Published: (2026)
by: Zhang, Qian, et al.
Published: (2026)
PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation
by: Xie, Tianxin, et al.
Published: (2025)
by: Xie, Tianxin, et al.
Published: (2025)
MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
by: Li, Xiquan, et al.
Published: (2025)
by: Li, Xiquan, et al.
Published: (2025)
AudioMotionBench: Evaluating Auditory Motion Perception in Audio LLMs
by: Sun, Zhe, et al.
Published: (2025)
by: Sun, Zhe, et al.
Published: (2025)
AQUALLM: Audio Question Answering Data Generation Using Large Language Models
by: Behera, Swarup Ranjan, et al.
Published: (2023)
by: Behera, Swarup Ranjan, et al.
Published: (2023)
MOSS-Audio Technical Report
by: Yang, Chen, et al.
Published: (2026)
by: Yang, Chen, et al.
Published: (2026)
Stable Audio 3
by: Evans, Zach, et al.
Published: (2026)
by: Evans, Zach, et al.
Published: (2026)
Audio-Maestro: Enhancing Large Audio-Language Models with Tool-Augmented Reasoning
by: Lee, Kuan-Yi, et al.
Published: (2025)
by: Lee, Kuan-Yi, et al.
Published: (2025)
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders
by: Aparin, Georgii, et al.
Published: (2026)
by: Aparin, Georgii, et al.
Published: (2026)
AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models
by: Kang, Mintong, et al.
Published: (2026)
by: Kang, Mintong, et al.
Published: (2026)
Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation
by: Li, Jia, et al.
Published: (2025)
by: Li, Jia, et al.
Published: (2025)
Similar Items
-
RA-QA: A Benchmarking System for Respiratory Audio Question Answering Under Real-World Heterogeneity
by: Bertolino, Gaia A., et al.
Published: (2026) -
RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
by: Zhang, Yuwei, et al.
Published: (2024) -
Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
by: Zhang, Yuwei, et al.
Published: (2024) -
Spatial Audio Question Answering and Reasoning on Dynamic Source Movements
by: Sridhar, Arvind Krishna, et al.
Published: (2026) -
ORCA: Open-ended Response Correctness Assessment for Audio Question Answering
by: Sedláček, Šimon, et al.
Published: (2025)