| Main Authors: | Chen, Yen-Shan, Lai, Shih-Yu, Tsou, Ying-Jung, Lin, Yi-Cheng, Chen, Bing-Yu, Chen, Yun-Nung, Lee, Hung-yi, Chen, Shang-Tse |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.05310 |
Similar Items
WavMark: Watermarking for Audio Generation
by: Chen, Guangyu, et al.
Published: (2023)
Latent Granular Resynthesis using Neural Audio Codecs
by: Tokui, Nao, et al.
Published: (2025)
LLM-Codec: Neural Audio Codec Meets Language Model Objectives
by: Chung, Ho-Lam, et al.
Published: (2026)
ALICE: A Multifaceted Evaluation Framework of Large Audio-Language Models' In-Context Learning Ability
by: Piao, Yen-Ting, et al.
Published: (2026)
When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models
by: Li, Chen-An, et al.
Published: (2025)
Causal Tracing of Audio-Text Fusion in Large Audio Language Models
by: Chen, Wei-Chih, et al.
Published: (2026)
AudioMarkBench: Benchmarking Robustness of Audio Watermarking
by: Liu, Hongbin, et al.
Published: (2024)
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition
by: Su, Hsuan, et al.
Published: (2024)
WAKE: Watermarking Audio with Key Enrichment
by: Xu, Yaoxun, et al.
Published: (2025)
Hearing the Order: Investigating Position Bias in Large Audio-Language Models
by: Lin, Yu-Xiang, et al.
Published: (2025)
Robust Distortion-Free Watermark for Autoregressive Audio Generation Models
by: Wu, Yihan, et al.
Published: (2025)
SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models
by: Yang, Chih-Kai, et al.
Published: (2025)
Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation
by: Feng, Bo-Han, et al.
Published: (2026)
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
by: Kuan, Chun-Yi, et al.
Published: (2024)
SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information
by: Yang, Chih-Kai, et al.
Published: (2025)
XAttnMark: Learning Robust Audio Watermarking with Cross-Attention
by: Liu, Yixin, et al.
Published: (2025)
TW-Sound580K: A Regional Audio-Text Dataset with Verification-Guided Curation for Localized Audio-Language Modeling
by: Xie, Hao-Hui, et al.
Published: (2026)
MI-Fuse: Label Fusion for Unsupervised Domain Adaptation with Closed-Source Large-Audio Language Model
by: Huang, Hsiao-Ying, et al.
Published: (2025)
Latent Watermarking of Audio Generative Models
by: Roman, Robin San, et al.
Published: (2024)
Toward Fair Speech Technologies: A Comprehensive Survey of Bias and Fairness in Speech AI
by: Lin, Yi-Cheng, et al.
Published: (2026)
Quantum-Trained Convolutional Neural Network for Deepfake Audio Detection
by: Lin, Chu-Hsuan Abraham, et al.
Published: (2024)
Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples
by: Kuan, Chun-Yi, et al.
Published: (2025)
MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models
by: Yang, Chih-Kai, et al.
Published: (2026)
Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy
by: Chen, Xuanjun, et al.
Published: (2025)
Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations
by: Feng, Bo-Han, et al.
Published: (2025)
P2Mark: Plug-and-play Parameter-level Watermarking for Neural Speech Generation
by: Ren, Yong, et al.
Published: (2025)
From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data
by: Kuan, Chun-Yi, et al.
Published: (2025)
ToxicTone: A Mandarin Audio Dataset Annotated for Toxicity and Toxic Utterance Tonality
by: Luo, Yu-Xiang, et al.
Published: (2025)
SincQDR-VAD: A Noise-Robust Voice Activity Detection Framework Leveraging Learnable Filters and Ranking-Aware Optimization
by: Wang, Chien-Chun, et al.
Published: (2025)
All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation
by: Foo, Leonardo Haw-Yang, et al.
Published: (2026)
AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering
by: Kuan, Chun-Yi, et al.
Published: (2026)
Measuring the Robustness of Audio Deepfake Detectors
by: Li, Xiang, et al.
Published: (2025)
ASTAR-NTU solution to AudioMOS Challenge 2025 Track1
by: Ritter-Gutierrez, Fabian, et al.
Published: (2025)
AQAScore: Evaluating Semantic Alignment in Text-to-Audio Generation via Audio Question Answering
by: Kuan, Chun-Yi, et al.
Published: (2026)
Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models
by: Ieong, Lok-Lam, et al.
Published: (2026)
CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset
by: Chen, Xuanjun, et al.
Published: (2025)
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling
by: Tseng, Liang-Hsuan, et al.
Published: (2025)
AudioLens: A Closer Look at Auditory Attribute Perception of Large Audio-Language Models
by: Yang, Chih-Kai, et al.
Published: (2025)
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
by: Tseng, Yuan, et al.
Published: (2023)
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
by: Du, Jiawei, et al.
Published: (2024)