| Main Authors: | Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computation and Language; Cryptography and Security |
| Online Access: | https://arxiv.org/abs/2403.17155 |
| _version_ | 1866913283249799168 |
|---|---|
| author | Lyu, Weimin; Lin, Xiao; Zheng, Songzhu; Pang, Lu; Ling, Haibin; Jha, Susmit; Chen, Chao |
| author_facet | Lyu, Weimin; Lin, Xiao; Zheng, Songzhu; Pang, Lu; Ling, Haibin; Jha, Susmit; Chen, Chao |
| contents | Textual backdoor attacks pose significant security threats. Current detection approaches, typically relying on intermediate feature representation or reconstructing potential triggers, are task-specific and less effective beyond sentence classification, struggling with tasks like question answering and named entity recognition. We introduce TABDet (Task-Agnostic Backdoor Detector), a pioneering task-agnostic method for backdoor detection. TABDet leverages final layer logits combined with an efficient pooling technique, enabling unified logit representation across three prominent NLP tasks. TABDet can jointly learn from diverse task-specific models, demonstrating superior detection efficacy over traditional task-specific methods. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2403_17155 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Task-Agnostic Detector for Insertion-Based Backdoor Attacks |
| topic | Computation and Language; Cryptography and Security |
| url | https://arxiv.org/abs/2403.17155 |