Staff View: Latent-Mark: An Audio Watermark Robust to Neural Resynthesis

Bibliographic Details
Main Authors:	Chen, Yen-Shan; Lai, Shih-Yu; Tsou, Ying-Jung; Lin, Yi-Cheng; Chen, Bing-Yu; Chen, Yun-Nung; Lee, Hung-yi; Chen, Shang-Tse
Format:	Preprint
Published:	2026
Subjects:	Sound; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.05310

_version_	1866915863807918080
author	Chen, Yen-Shan; Lai, Shih-Yu; Tsou, Ying-Jung; Lin, Yi-Cheng; Chen, Bing-Yu; Chen, Yun-Nung; Lee, Hung-yi; Chen, Shang-Tse
author_facet	Chen, Yen-Shan; Lai, Shih-Yu; Tsou, Ying-Jung; Lin, Yi-Cheng; Chen, Bing-Yu; Chen, Yun-Nung; Lee, Hung-yi; Chen, Shang-Tse
contents	While existing audio watermarking techniques have achieved strong robustness against traditional digital signal processing (DSP) attacks, they remain vulnerable to neural resynthesis. This occurs because modern neural audio codecs act as semantic filters and discard the imperceptible waveform variations used in prior watermarking methods. To address this limitation, we propose Latent-Mark, the first zero-bit audio watermarking framework designed to survive semantic compression. Our key insight is that robustness to the encode-decode process requires embedding the watermark within the codec's invariant latent space. We achieve this by optimizing the audio waveform to induce a detectable directional shift in its encoded latent representation, while constraining perturbations to align with the natural audio manifold to ensure imperceptibility. To prevent overfitting to a single codec's quantization rules, we introduce Cross-Codec Optimization, jointly optimizing the waveform across multiple surrogate codecs to target shared latent invariants. Extensive evaluations demonstrate robust zero-shot transferability to unseen neural codecs, achieving state-of-the-art resilience against traditional DSP attacks while preserving perceptual imperceptibility. Our work inspires future research into universal watermarking frameworks capable of maintaining integrity across increasingly complex and diverse generative distortions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_05310
institution	arXiv
publishDate	2026
record_format	arxiv
title	Latent-Mark: An Audio Watermark Robust to Neural Resynthesis
topic	Sound; Artificial Intelligence
url	https://arxiv.org/abs/2603.05310
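
Illustrative note (not part of the record): the contents field describes optimizing a waveform so that its encoded latent shifts along a detectable direction, jointly across several surrogate codecs. The following is a minimal toy sketch of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the "codecs" are random linear maps followed by tanh, the secret directions are random unit vectors, and an L2 bound on the perturbation stands in for the paper's natural-audio-manifold constraint. The names `encoders`, `directions`, `score`, `score_grad`, and `embed` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SAMPLES, N_LATENT, N_CODECS = 256, 16, 3

# Hypothetical surrogate "encoders": random linear maps followed by tanh.
# Real neural codecs are far more complex; this is only a toy stand-in.
encoders = [
    rng.normal(0.0, 1.0 / np.sqrt(N_SAMPLES), (N_LATENT, N_SAMPLES))
    for _ in range(N_CODECS)
]

# Hypothetical secret unit direction in each encoder's latent space.
directions = []
for _ in range(N_CODECS):
    w = rng.normal(size=N_LATENT)
    directions.append(w / np.linalg.norm(w))


def score(x):
    """Mean projection of each encoder's latent onto its secret direction."""
    return float(np.mean([w @ np.tanh(E @ x) for E, w in zip(encoders, directions)]))


def score_grad(x):
    """Analytic gradient of score() with respect to the waveform x."""
    grads = []
    for E, w in zip(encoders, directions):
        z = np.tanh(E @ x)
        grads.append(E.T @ ((1.0 - z ** 2) * w))
    return np.mean(grads, axis=0)


def embed(x, eps=0.5, step=0.05, iters=50):
    """Projected gradient ascent on the joint score, with ||delta||_2 <= eps."""
    delta = np.zeros_like(x)
    for _ in range(iters):
        g = score_grad(x + delta)
        delta = delta + step * g / (np.linalg.norm(g) + 1e-12)
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta *= eps / norm  # project back onto the L2 ball
    return x + delta


x = rng.normal(size=N_SAMPLES)   # stand-in for an audio clip
x_marked = embed(x)
print("score before:", score(x))
print("score after: ", score(x_marked))
```

In this toy setting, zero-bit detection would amount to comparing `score` against a threshold; averaging the gradient over all surrogate encoders is the simplified analogue of jointly optimizing across codecs.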

Similar Items