| Main Authors: | Chen, Sizhe, Piet, Julien, Sitawarin, Chawin, Wagner, David |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2402.06363 |
Similar Items
Defending Against Prompt Injection With a Few DefensiveTokens
by: Chen, Sizhe, et al.
Published: (2025)
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
by: Piet, Julien, et al.
Published: (2023)
Defending Against Prompt Injection with DataFilter
by: Wang, Yizhu, et al.
Published: (2025)
Mark My Words: Analyzing and Evaluating Language Model Watermarks
by: Piet, Julien, et al.
Published: (2023)
PubDef: Defending Against Transfer Attacks From Public Models
by: Sitawarin, Chawin, et al.
Published: (2023)
SecAlign: Defending Against Prompt Injection with Preference Optimization
by: Chen, Sizhe, et al.
Published: (2024)
Lessons from Defending Gemini Against Indirect Prompt Injections
by: Shi, Chongyang, et al.
Published: (2025)
JailbreaksOverTime: Detecting Jailbreak Attacks Under Distribution Shift
by: Piet, Julien, et al.
Published: (2025)
Parser-Free Querying of Security Logs
by: Luo, Evan, et al.
Published: (2026)
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks
by: Chen, Sizhe, et al.
Published: (2025)
PAL: Proxy-Guided Black-Box Attack on Large Language Models
by: Sitawarin, Chawin, et al.
Published: (2024)
The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections
by: Nasr, Milad, et al.
Published: (2025)
Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs
by: Zhong, Yinan, et al.
Published: (2025)
Soft Instruction De-escalation Defense
by: Walter, Nils Philipp, et al.
Published: (2025)
RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage
by: Zhong, Peter Yong, et al.
Published: (2025)
Defending Against Indirect Prompt Injection Attacks With Spotlighting
by: Hines, Keegan, et al.
Published: (2024)
AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization
by: Ying, Zonghao, et al.
Published: (2026)
ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection
by: Weng, Shihao, et al.
Published: (2026)
ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior
by: Lu, Weikai, et al.
Published: (2025)
GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks
by: Li, Rongchang, et al.
Published: (2024)
Defending against Indirect Prompt Injection by Instruction Detection
by: Wen, Tongyu, et al.
Published: (2025)
PromptShield: Deployable Detection for Prompt Injection Attacks
by: Jacob, Dennis, et al.
Published: (2025)
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
by: Chen, Yulin, et al.
Published: (2025)
Defense Against Prompt Injection Attack by Leveraging Attack Techniques
by: Chen, Yulin, et al.
Published: (2024)
zkStruDul: Programming zkSNARKs with Structural Duality
by: Krishnan, Rahul, et al.
Published: (2025)
VIGIL: Defending LLM Agents Against Tool Stream Injection via Verify-Before-Commit
by: Lin, Junda, et al.
Published: (2026)
QueryIPI: Query-agnostic Indirect Prompt Injection on Coding Agents
by: Xie, Yuchong, et al.
Published: (2025)
Who Grants the Agent Power? Defending Against Instruction Injection via Task-Centric Access Control
by: Cai, Yifeng, et al.
Published: (2025)
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models
by: Feng, Yang, et al.
Published: (2025)
The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents
by: Jia, Feiran, et al.
Published: (2024)
PlanGuard: Defending Agents against Indirect Prompt Injection via Planning-based Consistency Verification
by: Gong, Guangyu, et al.
Published: (2026)
AgentArmor: Enforcing Program Analysis on Agent Runtime Trace to Defend Against Prompt Injection
by: Wang, Peiran, et al.
Published: (2025)
Strengthening Polymorphic Prompt Assembling: Dynamic Separator Generation Against Emerging Prompt Injection Attacks
by: Dorzhiev, Nima, et al.
Published: (2026)
Preventing Prompt Injection with Type-Directed Privilege Separation
by: Jacob, Dennis, et al.
Published: (2025)
To Protect the LLM Agent Against the Prompt Injection Attack with Polymorphic Prompt
by: Wang, Zhilong, et al.
Published: (2025)
DRIP: Defending Prompt Injection via Token-wise Representation Editing and Residual Instruction Fusion
by: Liu, Ruofan, et al.
Published: (2025)
Agent Privilege Separation in OpenClaw: A Structural Defense Against Prompt Injection
by: Cheng, Darren, et al.
Published: (2026)
Analysis of LLMs Against Prompt Injection and Jailbreak Attacks
by: Jaiswal, Piyush, et al.
Published: (2026)
Securing AI Agents Against Prompt Injection Attacks
by: Ramakrishnan, Badrinath, et al.
Published: (2025)
Semantic-Aware Parsing for Security Logs
by: Piet, Julien, et al.
Published: (2025)