Enregistré dans:
Détails bibliographiques
Auteurs principaux: Mukherjee, Amitangshu, Ibrayev, Timur, Roy, Kaushik
Format: Preprint
Publié: 2024
Sujets:
Accès en ligne:https://arxiv.org/abs/2404.00185
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
_version_ 1866909161833365504
author Mukherjee, Amitangshu
Ibrayev, Timur
Roy, Kaushik
author_facet Mukherjee, Amitangshu
Ibrayev, Timur
Roy, Kaushik
contents Current Deep Neural Networks are vulnerable to adversarial examples, which alter their predictions by adding carefully crafted noise. Since human eyes are robust to such inputs, it is possible that the vulnerability stems from the standard way of processing inputs in one shot by processing every pixel with the same importance. In contrast, neuroscience suggests that the human vision system can differentiate salient features by (1) switching between multiple fixation points (saccades) and (2) processing the surrounding with a non-uniform external resolution (foveation). In this work, we advocate that the integration of such active vision mechanisms into current deep learning systems can offer robustness benefits. Specifically, we empirically demonstrate the inherent robustness of two active vision methods - GFNet and FALcon - under a black box threat model. By learning and inferencing based on downsampled glimpses obtained from multiple distinct fixation points within an input, we show that these active methods achieve (2-3) times greater robustness compared to a standard passive convolutional network under state-of-the-art adversarial attacks. More importantly, we provide illustrative and interpretable visualization analysis that demonstrates how performing inference from distinct fixation points makes active vision methods less vulnerable to malicious inputs.
format Preprint
id arxiv_https___arxiv_org_abs_2404_00185
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle On Inherent Adversarial Robustness of Active Vision Systems
Mukherjee, Amitangshu
Ibrayev, Timur
Roy, Kaushik
Computer Vision and Pattern Recognition
Artificial Intelligence
Current Deep Neural Networks are vulnerable to adversarial examples, which alter their predictions by adding carefully crafted noise. Since human eyes are robust to such inputs, it is possible that the vulnerability stems from the standard way of processing inputs in one shot by processing every pixel with the same importance. In contrast, neuroscience suggests that the human vision system can differentiate salient features by (1) switching between multiple fixation points (saccades) and (2) processing the surrounding with a non-uniform external resolution (foveation). In this work, we advocate that the integration of such active vision mechanisms into current deep learning systems can offer robustness benefits. Specifically, we empirically demonstrate the inherent robustness of two active vision methods - GFNet and FALcon - under a black box threat model. By learning and inferencing based on downsampled glimpses obtained from multiple distinct fixation points within an input, we show that these active methods achieve (2-3) times greater robustness compared to a standard passive convolutional network under state-of-the-art adversarial attacks. More importantly, we provide illustrative and interpretable visualization analysis that demonstrates how performing inference from distinct fixation points makes active vision methods less vulnerable to malicious inputs.
title On Inherent Adversarial Robustness of Active Vision Systems
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2404.00185