Guardado en:
Detalles Bibliográficos
Autores principales: Gupta, Ankit, Adami, Christoph, Dolson, Emily
Formato: Preprint
Publicado: 2025
Materias:
Acceso en línea:https://arxiv.org/abs/2512.06185
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
_version_ 1866909946775339008
author Gupta, Ankit
Adami, Christoph
Dolson, Emily
author_facet Gupta, Ankit
Adami, Christoph
Dolson, Emily
contents Deep neural networks (DNNs) excel across image recognition tasks, yet continue to exhibit overconfidence on inputs that bear no resemblance to natural images. Revisiting the "fooling images" work introduced by Nguyen et al. (2015), we re-implement both CPPN-based and direct-encoding-based evolutionary fooling attacks on modern architectures, including convolutional and transformer classifiers. Our re-implementation confirm that high-confidence fooling persists even in state-of-the-art networks, with transformer-based ViT-B/16 emerging as the most susceptible--achieving near-certain misclassifications with substantially fewer queries than convolution-based models. We then introduce SPOOF, a minimalist, consistent, and more efficient black-box attack generating high-confidence fooling images. Despite its simplicity, SPOOF generates unrecognizable fooling images with minimal pixel modifications and drastically reduced compute. Furthermore, retraining with fooling images as an additional class provides only partial resistance, as SPOOF continues to fool consistently with slightly higher query budgets--highlighting persistent fragility of modern deep classifiers.
format Preprint
id arxiv_https___arxiv_org_abs_2512_06185
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Gupta, Ankit
Adami, Christoph
Dolson, Emily
Computer Vision and Pattern Recognition
Deep neural networks (DNNs) excel across image recognition tasks, yet continue to exhibit overconfidence on inputs that bear no resemblance to natural images. Revisiting the "fooling images" work introduced by Nguyen et al. (2015), we re-implement both CPPN-based and direct-encoding-based evolutionary fooling attacks on modern architectures, including convolutional and transformer classifiers. Our re-implementation confirm that high-confidence fooling persists even in state-of-the-art networks, with transformer-based ViT-B/16 emerging as the most susceptible--achieving near-certain misclassifications with substantially fewer queries than convolution-based models. We then introduce SPOOF, a minimalist, consistent, and more efficient black-box attack generating high-confidence fooling images. Despite its simplicity, SPOOF generates unrecognizable fooling images with minimal pixel modifications and drastically reduced compute. Furthermore, retraining with fooling images as an additional class provides only partial resistance, as SPOOF continues to fool consistently with slightly higher query budgets--highlighting persistent fragility of modern deep classifiers.
title SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2512.06185