Bibliographic Details
Main Authors: Lyu, Weimin; Lin, Xiao; Zheng, Songzhu; Pang, Lu; Ling, Haibin; Jha, Susmit; Chen, Chao
Format: Preprint
Published: 2024
Subjects: Computation and Language; Cryptography and Security
Online Access: https://arxiv.org/abs/2403.17155
author Lyu, Weimin
Lin, Xiao
Zheng, Songzhu
Pang, Lu
Ling, Haibin
Jha, Susmit
Chen, Chao
contents Textual backdoor attacks pose significant security threats. Current detection approaches, typically relying on intermediate feature representation or reconstructing potential triggers, are task-specific and less effective beyond sentence classification, struggling with tasks like question answering and named entity recognition. We introduce TABDet (Task-Agnostic Backdoor Detector), a pioneering task-agnostic method for backdoor detection. TABDet leverages final layer logits combined with an efficient pooling technique, enabling unified logit representation across three prominent NLP tasks. TABDet can jointly learn from diverse task-specific models, demonstrating superior detection efficacy over traditional task-specific methods.
format Preprint
id arxiv_https___arxiv_org_abs_2403_17155
institution arXiv
publishDate 2024
record_format arxiv
title Task-Agnostic Detector for Insertion-Based Backdoor Attacks
topic Computation and Language
Cryptography and Security
url https://arxiv.org/abs/2403.17155