Staff View: SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning :: Library Catalog

Bibliographic Details
Main Authors:	Zheng, Mengxin; Xue, Jiaqi; Wang, Zihao; Chen, Xun; Lou, Qian; Jiang, Lei; Wang, Xiaofeng
Format:	Preprint
Published:	2023
Subjects:	Cryptography and Security; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2303.09079

_version_	1866913433368133632
author	Zheng, Mengxin; Xue, Jiaqi; Wang, Zihao; Chen, Xun; Lou, Qian; Jiang, Lei; Wang, Xiaofeng
author_facet	Zheng, Mengxin; Xue, Jiaqi; Wang, Zihao; Chen, Xun; Lou, Qian; Jiang, Lei; Wang, Xiaofeng
contents	Self-supervised learning (SSL) is a prevalent approach for encoding data representations. Using a pre-trained SSL image encoder and subsequently training a downstream classifier, impressive performance can be achieved on various tasks with very little labeled data. The growing adoption of SSL has led to an increase in security research on SSL encoders and associated Trojan attacks. Trojan attacks embedded in SSL encoders can operate covertly, spreading across multiple users and devices. The presence of backdoor behavior in Trojaned encoders can inadvertently be inherited by downstream classifiers, making it even more difficult to detect and mitigate the threat. Although current Trojan detection methods in supervised learning can potentially safeguard SSL downstream classifiers, identifying and addressing triggers in the SSL encoder before its widespread dissemination is a challenging task. This challenge arises because downstream tasks might be unknown, dataset labels may be unavailable, and the original unlabeled training dataset might be inaccessible during Trojan detection in SSL encoders. We introduce SSL-Cleanse as a solution to identify and mitigate backdoor threats in SSL encoders. We evaluated SSL-Cleanse on various datasets using 1200 encoders, achieving an average detection success rate of 82.2% on ImageNet-100. After mitigating backdoors, on average, backdoored encoders achieve 0.3% attack success rate without great accuracy loss, proving the effectiveness of SSL-Cleanse. The source code of SSL-Cleanse is available at https://github.com/UCF-ML-Research/SSL-Cleanse.
format	Preprint
id	arxiv_https___arxiv_org_abs_2303_09079
institution	arXiv
publishDate	2023
record_format	arxiv
title	SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning
topic	Cryptography and Security; Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2303.09079

Similar Items