Bibliographic Details
Main Authors: Chen, Lu; Li, Shaofeng; Huang, Benhao; Yang, Fan; Li, Zheng; Li, Jie; Luo, Yuan
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2402.02095
author Chen, Lu
Li, Shaofeng
Huang, Benhao
Yang, Fan
Li, Zheng
Li, Jie
Luo, Yuan
contents Existing works have extensively studied adversarial examples, which are minimal perturbations that can mislead the output of deep neural networks (DNNs) while remaining imperceptible to humans. However, in this work, we reveal the existence of a harmless perturbation space, in which perturbations drawn from this space, regardless of their magnitudes, leave the network output unchanged when applied to inputs. Essentially, the harmless perturbation space emerges from the usage of non-injective functions (linear or non-linear layers) within DNNs, enabling multiple distinct inputs to be mapped to the same output. For linear layers with input dimensions exceeding output dimensions, any linear combination of the orthogonal bases of the nullspace of the parameter consistently yields no change in their output. For non-linear layers, the harmless perturbation space may expand, depending on the properties of the layers and input samples. Inspired by this property of DNNs, we solve for a family of general perturbation spaces that are redundant for the DNN's decision, and can be used to hide sensitive data and serve as a means of model identification. Our work highlights the distinctive robustness of DNNs (i.e., consistency under large magnitude perturbations) in contrast to adversarial examples (vulnerability for small imperceptible noises).
format Preprint
id arxiv_https___arxiv_org_abs_2402_02095
institution arXiv
publishDate 2024
record_format arxiv
title Contrasting Adversarial Perturbations: The Space of Harmless Perturbations
topic Machine Learning
url https://arxiv.org/abs/2402.02095
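
The abstract's claim about linear layers can be illustrated with a small numerical sketch. This is not the authors' code: the weight matrix `W`, the input `x`, the layer sizes (8 inputs, 3 outputs), and the helper `nullspace_basis` are all hypothetical and chosen only for illustration. The sketch assumes a bias-free linear layer y = Wx whose input dimension exceeds its output dimension, builds an orthonormal basis of the nullspace of W with an SVD, and checks that a large perturbation drawn from that nullspace leaves the output unchanged up to floating-point error.

```python
import numpy as np

def nullspace_basis(W, tol=1e-10):
    """Return an orthonormal basis (as columns) of the nullspace of W.

    Hypothetical helper for illustration: the rows of Vt beyond the
    numerical rank of W span the nullspace.
    """
    _, s, vt = np.linalg.svd(W, full_matrices=True)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

rng = np.random.default_rng(0)

# Assumed toy layer: 8 inputs mapped to 3 outputs, no bias term.
W = rng.standard_normal((3, 8))
x = rng.standard_normal(8)

# For a full-rank 3x8 matrix the nullspace has dimension 8 - 3 = 5.
N = nullspace_basis(W)

# Any linear combination of the basis vectors is a candidate harmless
# perturbation; the coefficients are deliberately large.
coeffs = 1e3 * rng.standard_normal(N.shape[1])
delta = N @ coeffs

y_clean = W @ x
y_perturbed = W @ (x + delta)

# Largest absolute change in the layer output caused by delta.
max_diff = float(np.max(np.abs(y_clean - y_perturbed)))
print("perturbation norm:", np.linalg.norm(delta))
print("max output change:", max_diff)
```

The sketch covers only the single linear-layer case. The abstract states that for non-linear layers the harmless perturbation space may expand depending on the layer and the input sample, which this example does not attempt to model.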