Bibliographic Details
Main Authors: Yang, Zhigang, Liu, Yuan, Zhang, Jiawei, Zhang, Puning, Ma, Xinqiang
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2512.03625
author Yang, Zhigang
Liu, Yuan
Zhang, Jiawei
Zhang, Puning
Ma, Xinqiang
contents Despite the remarkable performance of deep neural networks (DNNs) in image classification, their vulnerability to adversarial attacks remains a critical challenge. Most existing detection methods rely on complex architectures that compromise interpretability and generalization. To address this, we propose FeatureLens, a lightweight framework that acts as a lens to scrutinize anomalies in image features. Comprising an Image Feature Extractor (IFE) and shallow classifiers (e.g., SVM, MLP, or XGBoost) with model sizes ranging from 1,000 to 30,000 parameters, FeatureLens achieves detection accuracy ranging from 97.8% to 99.75% in closed-set evaluation and from 86.17% to 99.6% in generalization evaluation across FGSM, PGD, CW, and DAmageNet attacks, using only 51-dimensional features. By combining strong detection performance with excellent generalization, interpretability, and computational efficiency, FeatureLens offers a practical pathway toward transparent and effective adversarial defense.
format Preprint
id arxiv_https___arxiv_org_abs_2512_03625
institution arXiv
publishDate 2025
record_format arxiv
title FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2512.03625