Bibliographic Details
Main Authors:	Lyu, Weimin; Lin, Xiao; Zheng, Songzhu; Pang, Lu; Ling, Haibin; Jha, Susmit; Chen, Chao
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Cryptography and Security
Online Access:	https://arxiv.org/abs/2403.17155

Similar Items

Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations
by: Pang, Lu, et al.
Published: (2024)

Test-Time Backdoor Attacks on Multimodal Large Language Models
by: Lu, Dong, et al.
Published: (2024)

CAPAA: Classifier-Agnostic Projector-Based Adversarial Attack
by: Li, Zhan, et al.
Published: (2025)

MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs
by: Wen, Rui, et al.
Published: (2026)

Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
by: Wang, Jiongxiao, et al.
Published: (2024)

SynGhost: Invisible and Universal Task-agnostic Backdoor Attack via Syntactic Transfer
by: Cheng, Pengzhou, et al.
Published: (2024)

Denial-of-Service Poisoning Attacks against Large Language Models
by: Gao, Kuofeng, et al.
Published: (2024)

Data Extraction Attacks in Retrieval-Augmented Generation via Backdoors
by: Peng, Yuefeng, et al.
Published: (2024)

Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning
by: Zhao, Shuai, et al.
Published: (2024)

BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models
by: Wu, Zhengxian, et al.
Published: (2025)

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
by: Wang, Zihan, et al.
Published: (2025)

SteganoBackdoor: Stealthy and Data-Efficient Backdoor Attacks on Language Models
by: Xue, Eric, et al.
Published: (2025)

Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models
by: Jiang, Peihai, et al.
Published: (2025)

Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers
by: Wei, Jiali, et al.
Published: (2026)

Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks
by: Li, Ziqiang, et al.
Published: (2024)

TuBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning
by: He, Xuanli, et al.
Published: (2024)

BadFair: Backdoored Fairness Attacks with Group-conditioned Triggers
by: Xue, Jiaqi, et al.
Published: (2024)

Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
by: Wang, Tianchun, et al.
Published: (2024)

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks
by: He, Xuanli, et al.
Published: (2024)

Model-Agnostic Lifelong LLM Safety via Externalized Attack-Defense Co-Evolution
by: Zhang, Xiaozhe, et al.
Published: (2026)

Concept-Guided Backdoor Attack on Vision Language Models
by: Shen, Haoyu, et al.
Published: (2025)

Simulate and Eliminate: Revoke Backdoors for Generative Large Language Models
by: Li, Haoran, et al.
Published: (2024)

Claim-Guided Textual Backdoor Attack for Practical Applications
by: Song, Minkyoo, et al.
Published: (2024)

Composite Backdoor Attacks Against Large Language Models
by: Huang, Hai, et al.
Published: (2023)

Privacy Preserving In-Context-Learning Framework for Large Language Models
by: Bhusal, Bishnu, et al.
Published: (2025)

Breaking PEFT Limitations: Leveraging Weak-to-Strong Knowledge Transfer for Backdoor Attacks in LLMs
by: Zhao, Shuai, et al.
Published: (2024)

Phantom: General Backdoor Attacks on Retrieval Augmented Language Generation
by: Chaudhari, Harsh, et al.
Published: (2024)

Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation
by: Zhao, Shuai, et al.
Published: (2024)

UOR: Universal Backdoor Attacks on Pre-trained Language Models
by: Du, Wei, et al.
Published: (2023)

Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
by: Yang, Wenkai, et al.
Published: (2024)

Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning
by: Zhao, Shuai, et al.
Published: (2024)

A Survey of Recent Backdoor Attacks and Defenses in Large Language Models
by: Zhao, Shuai, et al.
Published: (2024)

PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
by: Mangaokar, Neal, et al.
Published: (2024)

IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding
by: Li, Junxian, et al.
Published: (2025)

GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors
by: Meng, Wenlong, et al.
Published: (2025)

MBTSAD: Mitigating Backdoors in Language Models Based on Token Splitting and Attention Distillation
by: Ding, Yidong, et al.
Published: (2025)

Harnessing Task Overload for Scalable Jailbreak Attacks on Large Language Models
by: Dong, Yiting, et al.
Published: (2024)

BadActs: A Universal Backdoor Defense in the Activation Space
by: Yi, Biao, et al.
Published: (2024)

MGTEVAL: An Interactive Platform for Systematic Evaluation of Machine-Generated Text Detectors
by: Li, Yuanfan, et al.
Published: (2026)

StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Organization Structures
by: Li, Bangxin, et al.
Published: (2024)