Bibliographic Details
Main Authors: Amerehi, Fatemeh, Healy, Patrick
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2504.11034
author Amerehi, Fatemeh
Healy, Patrick
author_facet Amerehi, Fatemeh
Healy, Patrick
contents Adversarial training is a common strategy for enhancing model robustness against adversarial attacks. However, it is typically tailored to the specific attack types it is trained on, limiting its ability to generalize to unseen threat models. Adversarial purification offers an alternative by leveraging a generative model to remove perturbations before classification. Since the purifier is trained independently of both the classifier and the threat models, it is better equipped to handle previously unseen attack scenarios. Diffusion models have proven highly effective for noise purification, not only in countering pixel-wise adversarial perturbations but also in addressing non-adversarial data shifts. In this study, we broaden the focus beyond pixel-wise robustness to explore the extent to which purification can mitigate both spectral and spatial adversarial attacks. Our findings highlight its effectiveness in handling diverse distortion patterns across low- to high-frequency regions.
format Preprint
id arxiv_https___arxiv_org_abs_2504_11034
institution arXiv
publishDate 2025
record_format arxiv
title Defending Against Frequency-Based Attacks with Diffusion Models
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2504.11034