Staff View: Backdoor Attack with Mode Mixture Latent Modification :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Hongwei; Xu, Xiaoyin; An, Dongsheng; Gu, Xianfeng; Zhang, Min
Format:	Preprint
Published:	2024
Subjects:	Cryptography and Security; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.07463

_version_	1866914710930063360
author	Zhang, Hongwei; Xu, Xiaoyin; An, Dongsheng; Gu, Xianfeng; Zhang, Min
author_facet	Zhang, Hongwei; Xu, Xiaoyin; An, Dongsheng; Gu, Xianfeng; Zhang, Min
contents	Backdoor attacks have become a significant security concern for deep neural networks in recent years. An image classification model can be compromised if malicious backdoors are injected into it. This corruption causes the model to function normally on clean images but predict a specific target label when triggers are present. Previous research falls into two categories: poisoning a portion of the dataset with triggered images so that users train the backdoored model from scratch, or training a backdoored model alongside a triggered image generator. Both approaches require a significant number of attackable parameters for optimization to establish a connection between the trigger and the target label, which may raise suspicion as more people become aware of the existence of backdoor attacks. In this paper, we propose a backdoor attack paradigm that requires only minimal alterations (specifically, to the output layer) to a clean model in order to inject the backdoor under the guise of fine-tuning. To achieve this, we leverage mode mixture samples, which are located between different modes in latent space, and introduce a novel method for conducting backdoor attacks. We evaluate the effectiveness of our method on four popular benchmark datasets: MNIST, CIFAR-10, GTSRB, and TinyImageNet.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_07463
institution	arXiv
publishDate	2024
record_format	arxiv
title	Backdoor Attack with Mode Mixture Latent Modification
topic	Cryptography and Security; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2403.07463

Similar Items