Bibliographic Details
Main authors: Wu, Qilong; Chandrasekaran, Varun
Format: Preprint
Published: 2024
Subjects: Cryptography and Security
Online access: https://arxiv.org/abs/2412.02576
_version_ 1866913595801993216
author Wu, Qilong
Chandrasekaran, Varun
contents Watermarking approaches are widely used to identify whether images in circulation are authentic or AI-generated. Determining the robustness of image watermarking methods in the ``no-box'' setting, where the attacker is assumed to have no knowledge of the watermarking model, is an interesting problem. Our main finding is that evading watermark detection in the no-box setting is challenging: the success of the optimization-based transfer attacks (which involve training surrogate models) proposed in prior work~\cite{hu2024transfer} depends on impractical assumptions, including (i) aligning the architecture and training configurations of the victim's watermarking model and the attacker's surrogate models, and (ii) a large number of surrogate models with potentially large computational requirements. Relaxing these assumptions, i.e., moving to a more pragmatic threat model, results in a failed attack, with an evasion rate of at most $21.1\%$. We show that when the configuration is mostly aligned, a simple non-optimization attack we propose, OFT, can already exceed the success of optimization-based efforts with a single surrogate model. Under the same $\ell_\infty$-norm perturbation budget of $0.25$, the prior attack~\cite{hu2024transfer} is comparable to or worse than OFT in $11$ out of $12$ configurations and has a limited advantage in the remaining one. The code used for all our experiments is available at \url{https://github.com/Ardor-Wu/transfer}.
format Preprint
id arxiv_https___arxiv_org_abs_2412_02576
institution arXiv
publishDate 2024
record_format arxiv
title The Efficacy of Transfer-based No-box Attacks on Image Watermarking: A Pragmatic Analysis
topic Cryptography and Security
url https://arxiv.org/abs/2412.02576