| Main Authors: | Sun, Houmin, Hu, Zi, Li, Linxi, Wang, Yechen, Jin, Liwei, Li, Ming |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.16805 |
Similar Items
CompSpoof: A Dataset and Joint Learning Framework for Component-Level Audio Anti-spoofing Countermeasures
by: Zhang, Xueping, et al.
Published: (2025)
MultiAPI Spoof: A Multi-API Dataset and Local-Attention Network for Speech Anti-spoofing Detection
by: Zhang, Xueping, et al.
Published: (2025)
The Impact of Audio Watermarking on Audio Anti-Spoofing Countermeasures
by: Zhang, Zhenshan, et al.
Published: (2025)
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
by: Li, Guinan, et al.
Published: (2024)
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
by: Bai, Ye, et al.
Published: (2024)
Audio-Visual Speech Enhancement In Complex Scenarios With Separation And Dereverberation Joint Modeling
by: Du, Jiarong, et al.
Published: (2025)
Training-Free Multi-Step Audio Source Separation
by: Zang, Yongyi, et al.
Published: (2025)
HarmonicAttack: An Adaptive Cross-Domain Audio Watermark Removal
by: Li, Kexin, et al.
Published: (2025)
Audio Pirates: Black-box Audio Watermark Removal via Diffusion Priors
by: Yao, Lingfeng, et al.
Published: (2026)
Multi-bit Audio Watermarking
by: Lanzendörfer, Luca A., et al.
Published: (2025)
AST: Adaptive, Seamless, and Training-Free Precise Speech Editing
by: Lv, Sihan, et al.
Published: (2026)
Deep Audio Watermarks are Shallow: Limitations of Post-Hoc Watermarking Techniques for Speech
by: O'Reilly, Patrick, et al.
Published: (2025)
SyncGuard: Robust Audio Watermarking Capable of Countering Desynchronization Attacks
by: Gan, Zhenliang, et al.
Published: (2025)
Ti-Audio: The First Multi-Dialectal End-to-End Speech LLM for Tibetan
by: Wang, Jialing, et al.
Published: (2026)
Audio-Visual Separation with Hierarchical Fusion and Representation Alignment
by: Hu, Han, et al.
Published: (2025)
MelShield: Robust Mel-Domain Audio Watermarking for Provenance Attribution of AI Generated Synthesized Speech
by: Jin, Yutong, et al.
Published: (2026)
Hallo-Live: Real-Time Streaming Joint Audio-Video Avatar Generation with Asynchronous Dual-Stream and Human-Centric Preference Distillation
by: Li, Chunyu, et al.
Published: (2026)
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention
by: Li, Kai, et al.
Published: (2025)
Synaspot: A Lightweight, Streaming Multi-modal Framework for Keyword Spotting with Audio-Text Synergy
by: Li, Kewei, et al.
Published: (2025)
Streaming Audio Transformers for Online Audio Tagging
by: Dinkel, Heinrich, et al.
Published: (2023)
Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework
by: Zhang, Kuiyuan, et al.
Published: (2025)
WAKE: Watermarking Audio with Key Enrichment
by: Xu, Yaoxun, et al.
Published: (2025)
Latent Watermarking of Audio Generative Models
by: Roman, Robin San, et al.
Published: (2024)
Robust Distortion-Free Watermark for Autoregressive Audio Generation Models
by: Wu, Yihan, et al.
Published: (2025)
Self Voice Conversion as an Attack against Neural Audio Watermarking
by: Özer, Yigitcan, et al.
Published: (2026)
A Fast and Lightweight Model for Causal Audio-Visual Speech Separation
by: Sang, Wendi, et al.
Published: (2025)
HoliAntiSpoof: Audio LLM for Holistic Speech Anti-Spoofing
by: Xu, Xuenan, et al.
Published: (2026)
PromptSep: Generative Audio Separation via Multimodal Prompting
by: Wen, Yutong, et al.
Published: (2025)
XAttnMark: Learning Robust Audio Watermarking with Cross-Attention
by: Liu, Yixin, et al.
Published: (2025)
Separate First, Fuse Later: Mitigating Cross-Modal Interference in Audio-Visual LLMs Reasoning with Modality-Specific Chain-of-Thought
by: Li, Xuanchen, et al.
Published: (2026)
AudioMarkBench: Benchmarking Robustness of Audio Watermarking
by: Liu, Hongbin, et al.
Published: (2024)
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
by: Li, Kai, et al.
Published: (2023)
AudioGenie-Reasoner: A Training-Free Multi-Agent Framework for Coarse-to-Fine Audio Deep Reasoning
by: Rong, Yan, et al.
Published: (2025)
Whisper-AuT: Domain-Adapted Audio Encoder for Efficient Audio-LLM Training
by: Qiu, Jielin, et al.
Published: (2026)
Latent-Mark: An Audio Watermark Robust to Neural Resynthesis
by: Chen, Yen-Shan, et al.
Published: (2026)
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
by: Pegg, Samuel, et al.
Published: (2024)
From Coarse to Fine: Recursive Audio-Visual Semantic Enhancement for Speech Separation
by: Xue, Ke, et al.
Published: (2025)
WavMark: Watermarking for Audio Generation
by: Chen, Guangyu, et al.
Published: (2023)
SilentCipher: Deep Audio Watermarking
by: Singh, Mayank Kumar, et al.
Published: (2024)
An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
by: Li, Kai, et al.
Published: (2022)