| Main Authors: | Luo, Weidi, Zhang, Qiming, Lu, Tianyu, Liu, Xiaogeng, Hu, Bin, Chiu, Hung-Chun, Ma, Siyuan, Zhang, Yizhe, Xiao, Xusheng, Cao, Yinzhi, Xiang, Zhen, Xiao, Chaowei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.06607 |
Similar Items
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
by: Li, Hao, et al.
Published: (2025)
JailBreakV: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks
by: Luo, Weidi, et al.
Published: (2024)
MetaAgent: Automatically Constructing Multi-Agent Systems Based on Finite State Machines
by: Zhang, Yaolun, et al.
Published: (2025)
Automatic and Universal Prompt Injection Attacks against Large Language Models
by: Liu, Xiaogeng, et al.
Published: (2024)
AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection
by: Luo, Weidi, et al.
Published: (2025)
AutoDAN-Reasoning: Enhancing Strategies Exploration based Jailbreak Attacks with Test-Time Scaling
by: Liu, Xiaogeng, et al.
Published: (2025)
ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention
by: Wang, Xinyan, et al.
Published: (2026)
Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models
by: Luo, Weidi, et al.
Published: (2025)
SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation
by: Ma, Yingzi, et al.
Published: (2026)
OET: Optimization-based prompt injection Evaluation Toolkit
by: Pan, Jinsheng, et al.
Published: (2025)
RePD: Defending Jailbreak Attack through a Retrieval-based Prompt Decomposition Process
by: Wang, Peiran, et al.
Published: (2024)
Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character
by: Ma, Siyuan, et al.
Published: (2024)
WIPI: A New Web Threat for LLM-Driven Web Agents
by: Wu, Fangzhou, et al.
Published: (2024)
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
by: Liu, Xiaogeng, et al.
Published: (2023)
When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning
by: Liu, Xiaogeng, et al.
Published: (2026)
AgentDyn: Are Your Agent Security Defenses Deployable in Real-World Dynamic Environments?
by: Li, Hao, et al.
Published: (2026)
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
by: Wang, Yu, et al.
Published: (2024)
AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management
by: Wen, Ruoyao, et al.
Published: (2026)
Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models
by: Yu, Zhiyuan, et al.
Published: (2024)
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
by: Chen, Zhaorun, et al.
Published: (2024)
Agent+P: Guiding UI Agents via Symbolic Planning
by: Ma, Shang, et al.
Published: (2025)
ChatNekoHacker: Real-Time Fan Engagement with Conversational Agents
by: Sera, Takuya, et al.
Published: (2025)
Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models
by: Luo, Weidi, et al.
Published: (2026)
Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
by: Zhang, Yechao, et al.
Published: (2026)
Energy-efficient Decentralized Learning via Graph Sparsification
by: Zhang, Xusheng, et al.
Published: (2024)
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
by: Liu, Xiaogeng, et al.
Published: (2024)
HabitatAgent: An End-to-End Multi-Agent System for Housing Consultation
by: Yang, Hongyang, et al.
Published: (2026)
Mobile GUI Agents under Real-world Threats: Are We There Yet?
by: Liu, Guohong, et al.
Published: (2025)
Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation
by: Huang, Tianyu, et al.
Published: (2024)
FORTIS: Benchmarking Over-Privilege in Agent Skills
by: Li, Shawn, et al.
Published: (2026)
GrowthHacker: Automated Off-Policy Evaluation Optimization Using Code-Modifying LLM Agents
by: Wu, Jie JW, et al.
Published: (2025)
ProjDevBench: Benchmarking AI Coding Agents on End-to-End Project Development
by: Lu, Pengrui, et al.
Published: (2026)
Prototype2Code: End-to-end Front-end Code Generation from UI Design Prototypes
by: Xiao, Shuhong, et al.
Published: (2024)
DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents
by: Chen, Zhaorun, et al.
Published: (2026)
Artificial Intelligence as the New Hacker: Developing Agents for Offensive Security
by: Valencia, Leroy Jacob
Published: (2024)
FunctionalAgent: Towards end-to-end on-top functional design
by: Chen, Yuhao, et al.
Published: (2026)
VISTA: An End-to-End Benchmark for Visual Spec-to-Web-App Coding Agents
by: Guo, JunJia, et al.
Published: (2026)
ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models
by: Liu, Xiaogeng, et al.
Published: (2026)
AI-Trader: Benchmarking Autonomous Agents in Real-Time Financial Markets
by: Fan, Tianyu, et al.
Published: (2025)
HSCO-Bench: An Agent-Driven End-to-End Hardware-Software Co-design Benchmark for Systems-on-Chip
by: Tsai, Pei-Huan, et al.
Published: (2026)