| Main Authors: | Yoshimura, Kosuke, Kashima, Hisashi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.06991 |
Similar Items
Hierarchical Text Classification Using Black Box Large Language Models
by: Yoshimura, Kosuke, et al.
Published: (2025)
AaSP: Aliasing-aware Self-Supervised Pre-Training for Audio Spectrogram Transformers
by: Yamamoto, Kohei, et al.
Published: (2025)
Baseline Systems For The 2025 Low-Resource Audio Codec Challenge
by: Isik, Yusuf Ziya, et al.
Published: (2025)
Low-Resource Guidance for Controllable Latent Audio Diffusion
by: Novack, Zachary, et al.
Published: (2026)
APEX: Audio Prototype EXplanations for Classification Tasks
by: Kawa, Piotr, et al.
Published: (2026)
Investigating Modality Contribution in Audio LLMs for Music
by: Morais, Giovana, et al.
Published: (2025)
Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
by: Sundar, Anirudh S., et al.
Published: (2023)
Learning Interpretable Features in Audio Latent Spaces via Sparse Autoencoders
by: Paek, Nathan, et al.
Published: (2025)
Training-Free Multimodal Guidance for Video to Audio Generation
by: Grassucci, Eleonora, et al.
Published: (2025)
Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks
by: Elias, Noel
Published: (2024)
Improving Audio Classification by Transitioning from Zero- to Few-Shot
by: Taylor, James, et al.
Published: (2025)
Prototypical Contrastive Learning For Improved Few-Shot Audio Classification
by: Sgouropoulos, Christos, et al.
Published: (2025)
Content Adaptive Front End For Audio Classification
by: Verma, Prateek, et al.
Published: (2023)
Unmute the Patch Tokens: Rethinking Probing in Multi-Label Audio Classification
by: Rauch, Lukas, et al.
Published: (2025)
Leveraging Prediction Entropy for Automatic Prompt Weighting in Zero-Shot Audio-Language Classification
by: Khoury, Karim El, et al.
Published: (2026)
Do Audio LLMs Listen or Read? Analyzing and Mitigating Paralinguistic Failures with VoxParadox
by: Pang, Jiacheng, et al.
Published: (2026)
Knowing When to Answer: Adaptive Confidence Refinement for Reliable Audio-Visual Question Answering
by: Tran, Dinh Phu, et al.
Published: (2026)
"I am bad": Interpreting Stealthy, Universal and Robust Audio Jailbreaks in Audio-Language Models
by: Gupta, Isha, et al.
Published: (2025)
How to Label Resynthesized Audio: The Dual Role of Neural Audio Codecs in Audio Deepfake Detection
by: Xiao, Yixuan, et al.
Published: (2026)
ADNAC: Audio Denoiser using Neural Audio Codec
by: Jimon, Daniel, et al.
Published: (2025)
Finite Scalar Quantization Enables Redundant and Transmission-Robust Neural Audio Compression at Low Bit-rates
by: Julian, Harry, et al.
Published: (2025)
RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
by: Zhang, Yuwei, et al.
Published: (2024)
Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning
by: Liu, Zhaocheng, et al.
Published: (2025)
Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning
by: Wu, Daiqing, et al.
Published: (2026)
Focal Modulation Networks for Interpretable Sound Classification
by: Della Libera, Luca, et al.
Published: (2024)
Audio Processing using Pattern Recognition for Music Genre Classification
by: Chatterjee, Sivangi, et al.
Published: (2024)
Multi-label Zero-Shot Audio Classification with Temporal Attention
by: Dogan, Duygu, et al.
Published: (2024)
Virtual Consistency for Audio Editing
by: Cervera, Matthieu, et al.
Published: (2025)
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
by: Zhu, Wentao
Published: (2024)
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
by: Heggan, Calum, et al.
Published: (2024)
Exploring Meta Information for Audio-based Zero-shot Bird Classification
by: Gebhard, Alexander, et al.
Published: (2023)
An Investigation of Test-time Adaptation for Audio Classification under Background Noise
by: Shao, Weichuang, et al.
Published: (2025)
PACE: Pretrained Audio Continual Learning
by: Li, Chang, et al.
Published: (2026)
Segmentwise Pruning in Audio-Language Models
by: Gibier, Marcel, et al.
Published: (2025)
Adapting Neural Audio Codecs to EEG
by: Kastrati, Ard, et al.
Published: (2025)
Semantic-Aware Interpretable Multimodal Music Auto-Tagging
by: Patakis, Andreas, et al.
Published: (2025)
Predicting User Intents and Musical Attributes from Music Discovery Conversations
by: Kwon, Daeyong, et al.
Published: (2024)
Audio Super-Resolution with Latent Bridge Models
by: Li, Chang, et al.
Published: (2025)
Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification
by: Bae, Sangmin, et al.
Published: (2023)
Mixture of Low-Rank Adapter Experts in Generalizable Audio Deepfake Detection
by: Laakkonen, Janne, et al.
Published: (2025)