| Main Authors: | Lyu, Weimin, Lin, Xiao, Zheng, Songzhu, Pang, Lu, Ling, Haibin, Jha, Susmit, Chen, Chao |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.17155 |
Similar Items
Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations
by: Pang, Lu, et al.
Published: (2024)
Test-Time Backdoor Attacks on Multimodal Large Language Models
by: Lu, Dong, et al.
Published: (2024)
CAPAA: Classifier-Agnostic Projector-Based Adversarial Attack
by: Li, Zhan, et al.
Published: (2025)
MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs
by: Wen, Rui, et al.
Published: (2026)
Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
by: Wang, Jiongxiao, et al.
Published: (2024)
SynGhost: Invisible and Universal Task-agnostic Backdoor Attack via Syntactic Transfer
by: Cheng, Pengzhou, et al.
Published: (2024)
Denial-of-Service Poisoning Attacks against Large Language Models
by: Gao, Kuofeng, et al.
Published: (2024)
Data Extraction Attacks in Retrieval-Augmented Generation via Backdoors
by: Peng, Yuefeng, et al.
Published: (2024)
Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning
by: Zhao, Shuai, et al.
Published: (2024)
BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models
by: Wu, Zhengxian, et al.
Published: (2025)
BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
by: Wang, Zihan, et al.
Published: (2025)
SteganoBackdoor: Stealthy and Data-Efficient Backdoor Attacks on Language Models
by: Xue, Eric, et al.
Published: (2025)
Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models
by: Jiang, Peihai, et al.
Published: (2025)
Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers
by: Wei, Jiali, et al.
Published: (2026)
Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks
by: Li, Ziqiang, et al.
Published: (2024)
TuBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning
by: He, Xuanli, et al.
Published: (2024)
BadFair: Backdoored Fairness Attacks with Group-conditioned Triggers
by: Xue, Jiaqi, et al.
Published: (2024)
Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
by: Wang, Tianchun, et al.
Published: (2024)
SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks
by: He, Xuanli, et al.
Published: (2024)
Model-Agnostic Lifelong LLM Safety via Externalized Attack-Defense Co-Evolution
by: Zhang, Xiaozhe, et al.
Published: (2026)
Concept-Guided Backdoor Attack on Vision Language Models
by: Shen, Haoyu, et al.
Published: (2025)
Simulate and Eliminate: Revoke Backdoors for Generative Large Language Models
by: Li, Haoran, et al.
Published: (2024)
Claim-Guided Textual Backdoor Attack for Practical Applications
by: Song, Minkyoo, et al.
Published: (2024)
Composite Backdoor Attacks Against Large Language Models
by: Huang, Hai, et al.
Published: (2023)
Privacy Preserving In-Context-Learning Framework for Large Language Models
by: Bhusal, Bishnu, et al.
Published: (2025)
Breaking PEFT Limitations: Leveraging Weak-to-Strong Knowledge Transfer for Backdoor Attacks in LLMs
by: Zhao, Shuai, et al.
Published: (2024)
Phantom: General Backdoor Attacks on Retrieval Augmented Language Generation
by: Chaudhari, Harsh, et al.
Published: (2024)
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation
by: Zhao, Shuai, et al.
Published: (2024)
UOR: Universal Backdoor Attacks on Pre-trained Language Models
by: Du, Wei, et al.
Published: (2023)
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
by: Yang, Wenkai, et al.
Published: (2024)
Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning
by: Zhao, Shuai, et al.
Published: (2024)
A Survey of Recent Backdoor Attacks and Defenses in Large Language Models
by: Zhao, Shuai, et al.
Published: (2024)
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
by: Mangaokar, Neal, et al.
Published: (2024)
IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding
by: Li, Junxian, et al.
Published: (2025)
GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors
by: Meng, Wenlong, et al.
Published: (2025)
MBTSAD: Mitigating Backdoors in Language Models Based on Token Splitting and Attention Distillation
by: Ding, Yidong, et al.
Published: (2025)
Harnessing Task Overload for Scalable Jailbreak Attacks on Large Language Models
by: Dong, Yiting, et al.
Published: (2024)
BadActs: A Universal Backdoor Defense in the Activation Space
by: Yi, Biao, et al.
Published: (2024)
MGTEVAL: An Interactive Platform for Systematic Evaluation of Machine-Generated Text Detectors
by: Li, Yuanfan, et al.
Published: (2026)
StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Organization Structures
by: Li, Bangxin, et al.
Published: (2024)