:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Shuhao; Li, Jiarui; Cao, Qi; Zhang, Ruiyi; Xie, Pengtao
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Machine Learning
Online Access:	https://arxiv.org/abs/2605.30837

Similar Items

SteganoBackdoor: Stealthy and Data-Efficient Backdoor Attacks on Language Models
by: Xue, Eric, et al.
Published: (2025)

Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning
by: Cao, Qi, et al.
Published: (2026)

Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents
by: Zhan, Qiusi, et al.
Published: (2025)

The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections
by: Nasr, Milad, et al.
Published: (2025)

RedVisor: Reasoning-Aware Prompt Injection Defense via Zero-Copy KV Cache Reuse
by: Liu, Mingrui, et al.
Published: (2026)

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
by: Debenedetti, Edoardo, et al.
Published: (2024)

PISmith: Reinforcement Learning-based Red Teaming for Prompt Injection Defenses
by: Yin, Chenlong, et al.
Published: (2026)

A Multi-Agent LLM Defense Pipeline Against Prompt Injection Attacks
by: Hossain, S M Asif, et al.
Published: (2025)

Formalizing and Benchmarking Prompt Injection Attacks and Defenses
by: Liu, Yupei, et al.
Published: (2023)

Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models
by: Sandoval, Gustavo, et al.
Published: (2025)

LeakSealer: A Semisupervised Defense for LLMs Against Prompt Injection and Leakage Attacks
by: Panebianco, Francesco, et al.
Published: (2025)

Evaluating Prompt Injection Defenses for Educational LLM Tutors: Security-Usability-Latency Trade-offs
by: Maiorano, Alexandre Cristovão
Published: (2026)

FilterFL: Knowledge Filtering-based Data-Free Backdoor Defense for Federated Learning
by: Yang, Yanxin, et al.
Published: (2023)

IDEA: Invariant Defense for Graph Adversarial Robustness
by: Tao, Shuchang, et al.
Published: (2023)

Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses
by: Yang, Xiaoxue, et al.
Published: (2025)

Preventing Prompt Injection with Type-Directed Privilege Separation
by: Jacob, Dennis, et al.
Published: (2025)

Mitigating Indirect Prompt Injection via Instruction-Following Intent Analysis
by: Kang, Mintong, et al.
Published: (2025)

Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection
by: Dautartas, Juozas, et al.
Published: (2026)

SecAlign: Defending Against Prompt Injection with Preference Optimization
by: Chen, Sizhe, et al.
Published: (2024)

Design Patterns for Securing LLM Agents against Prompt Injections
by: Beurer-Kellner, Luca, et al.
Published: (2025)

Lessons from Defending Gemini Against Indirect Prompt Injections
by: Shi, Chongyang, et al.
Published: (2025)

EvoMail: Self-Evolving Cognitive Agents for Adaptive Spam and Phishing Email Defense
by: Huang, Wei, et al.
Published: (2025)

Dashed Line Defense: Plug-And-Play Defense Against Adaptive Score-Based Query Attacks
by: Fu, Yanzhang, et al.
Published: (2026)

BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents
by: Zhang, Kaiyuan, et al.
Published: (2025)

PIShield: Detecting Prompt Injection Attacks via Intrinsic LLM Features
by: Zou, Wei, et al.
Published: (2025)

Certified Defense on the Fairness of Graph Neural Networks
by: Dong, Yushun, et al.
Published: (2023)

How Not to Detect Prompt Injections with an LLM
by: Choudhary, Sarthak, et al.
Published: (2025)

Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks
by: Pasquini, Dario, et al.
Published: (2024)

Adaptive Deception Framework with Behavioral Analysis for Enhanced Cybersecurity Defense
by: AL-Zahrani, Basil Abdullah
Published: (2025)

GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks
by: Li, Rongchang, et al.
Published: (2024)

A Systematic Literature Review on LLM Defenses Against Prompt Injection and Jailbreaking: Expanding NIST Taxonomy
by: Correia, Pedro H. Barcha, et al.
Published: (2026)

Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors
by: Sahabandu, Dinuka, et al.
Published: (2024)

Backdoored Retrievers for Prompt Injection Attacks on Retrieval Augmented Generation of Large Language Models
by: Clop, Cody, et al.
Published: (2024)

ARBoids: Adaptive Residual Reinforcement Learning With Boids Model for Cooperative Multi-USV Target Defense
by: Tao, Jiyue, et al.
Published: (2025)

Combining Machine Learning Defenses without Conflicts
by: Duddu, Vasisht, et al.
Published: (2024)

Evaluations of Machine Learning Privacy Defenses are Misleading
by: Aerni, Michael, et al.
Published: (2024)

Data Reconstruction Attacks and Defenses: A Systematic Evaluation
by: Liu, Sheng, et al.
Published: (2024)

BeniFul: Backdoor Defense via Middle Feature Analysis for Deep Neural Networks
by: Li, Xinfu, et al.
Published: (2024)

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
by: Huo, Mingjia, et al.
Published: (2024)

Robustness Inspired Graph Backdoor Defense
by: Zhang, Zhiwei, et al.
Published: (2024)