MARC21 :: Library Catalog

Bibliographic Details
Main Authors:	Sun, Jiachen; Wang, Changsheng; Wang, Jiongxiao; Zhang, Yiwei; Xiao, Chaowei
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; I.2.7; I.4
Online Access:	https://arxiv.org/abs/2405.10529

_version_	1866929472190545920
author	Sun, Jiachen; Wang, Changsheng; Wang, Jiongxiao; Zhang, Yiwei; Xiao, Chaowei
author_facet	Sun, Jiachen; Wang, Changsheng; Wang, Jiongxiao; Zhang, Yiwei; Xiao, Chaowei
contents	Large language models have become increasingly prominent, signaling a shift towards multimodality as the next frontier in artificial intelligence, where their embeddings are harnessed as prompts to generate textual content. Vision-language models (VLMs) stand at the forefront of this advancement, offering innovative ways to combine visual and textual data for enhanced understanding and interaction. However, this integration also enlarges the attack surface. Patch-based adversarial attacks are considered the most realistic threat model in physical vision applications, as demonstrated in much of the existing literature. In this paper, we propose to address patched visual prompt injection, where adversaries exploit adversarial patches to generate target content in VLMs. Our investigation reveals that patched adversarial prompts are sensitive to pixel-wise randomization, a trait that remains robust even against adaptive attacks designed to counteract such defenses. Leveraging this insight, we introduce SmoothVLM, a defense mechanism rooted in smoothing techniques and specifically tailored to protect VLMs from patched visual prompt injectors. Our framework lowers the attack success rate to between 0% and 5.0% on two leading VLMs while achieving around 67.3% to 95.0% context recovery of the benign images, demonstrating a balance between security and usability.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_10529
institution	arXiv
publishDate	2024
record_format	arxiv
title	Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors
topic	Computer Vision and Pattern Recognition; Artificial Intelligence; I.2.7; I.4
url	https://arxiv.org/abs/2405.10529
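
The abstract in the contents field describes a defense based on pixel-wise randomization combined with smoothing. The sketch below illustrates that general idea only and is not the authors' SmoothVLM implementation: it randomizes a fraction of pixels in several copies of an image, queries a model on each copy, and returns the majority answer. The names `randomize_pixels`, `smoothed_answer`, and `toy_model`, the grayscale list-of-lists image format, and the default parameter values are all assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Callable, List

# Assumption: a grayscale image as rows of ints in 0..255,
# standing in for a real image tensor.
Image = List[List[int]]


def randomize_pixels(image: Image, fraction: float, rng: random.Random) -> Image:
    """Return a copy in which each pixel is replaced by a random value
    with probability `fraction`."""
    return [
        [rng.randrange(256) if rng.random() < fraction else px for px in row]
        for row in image
    ]


def smoothed_answer(
    model: Callable[[Image], str],
    image: Image,
    num_copies: int = 11,
    fraction: float = 0.3,
    seed: int = 0,
) -> str:
    """Query `model` on several randomized copies and return the majority answer."""
    rng = random.Random(seed)
    votes = Counter(
        model(randomize_pixels(image, fraction, rng)) for _ in range(num_copies)
    )
    return votes.most_common(1)[0][0]


def toy_model(image: Image) -> str:
    """Hypothetical stand-in for a VLM: it is 'hijacked' only when a 4x4
    all-white patch in the top-left corner is fully intact."""
    patch_intact = all(image[r][c] == 255 for r in range(4) for c in range(4))
    return "injected" if patch_intact else "benign"


if __name__ == "__main__":
    # An 8x8 black image carrying the toy adversarial patch.
    patched = [[0] * 8 for _ in range(8)]
    for r in range(4):
        for c in range(4):
            patched[r][c] = 255
    print("direct query:  ", toy_model(patched))
    print("smoothed query:", smoothed_answer(toy_model, patched))
```

In this toy setting the patch only works when all 16 of its pixels survive, so randomizing about 30% of pixels breaks it in nearly every copy and the vote falls back to the benign answer. A real VLM and a real patch would behave less cleanly, and the record gives no parameter values, so the numbers here carry no meaning beyond the illustration.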

Similar Items