:: Library Catalog

Bibliographic Details
Main Authors:	Luo, Weidi; Zhang, Qiming; Lu, Tianyu; Liu, Xiaogeng; Hu, Bin; Chiu, Hung-Chun; Ma, Siyuan; Zhang, Yizhe; Xiao, Xusheng; Cao, Yinzhi; Xiang, Zhen; Xiao, Chaowei
Format:	Preprint
Published:	2025
Subjects:	Cryptography and Security
Online Access:	https://arxiv.org/abs/2510.06607

Similar Items

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
by: Li, Hao, et al.
Published: (2025)

JailBreakV: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks
by: Luo, Weidi, et al.
Published: (2024)

MetaAgent: Automatically Constructing Multi-Agent Systems Based on Finite State Machines
by: Zhang, Yaolun, et al.
Published: (2025)

Automatic and Universal Prompt Injection Attacks against Large Language Models
by: Liu, Xiaogeng, et al.
Published: (2024)

AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection
by: Luo, Weidi, et al.
Published: (2025)

AutoDAN-Reasoning: Enhancing Strategies Exploration based Jailbreak Attacks with Test-Time Scaling
by: Liu, Xiaogeng, et al.
Published: (2025)

ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention
by: Wang, Xinyan, et al.
Published: (2026)

Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models
by: Luo, Weidi, et al.
Published: (2025)

SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation
by: Ma, Yingzi, et al.
Published: (2026)

OET: Optimization-based prompt injection Evaluation Toolkit
by: Pan, Jinsheng, et al.
Published: (2025)

RePD: Defending Jailbreak Attack through a Retrieval-based Prompt Decomposition Process
by: Wang, Peiran, et al.
Published: (2024)

Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character
by: Ma, Siyuan, et al.
Published: (2024)

WIPI: A New Web Threat for LLM-Driven Web Agents
by: Wu, Fangzhou, et al.
Published: (2024)

AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
by: Liu, Xiaogeng, et al.
Published: (2023)

When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning
by: Liu, Xiaogeng, et al.
Published: (2026)

AgentDyn: Are Your Agent Security Defenses Deployable in Real-World Dynamic Environments?
by: Li, Hao, et al.
Published: (2026)

AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
by: Wang, Yu, et al.
Published: (2024)

AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management
by: Wen, Ruoyao, et al.
Published: (2026)

Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models
by: Yu, Zhiyuan, et al.
Published: (2024)

AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
by: Chen, Zhaorun, et al.
Published: (2024)

Agent+P: Guiding UI Agents via Symbolic Planning
by: Ma, Shang, et al.
Published: (2025)

ChatNekoHacker: Real-Time Fan Engagement with Conversational Agents
by: Sera, Takuya, et al.
Published: (2025)

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models
by: Luo, Weidi, et al.
Published: (2026)

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
by: Zhang, Yechao, et al.
Published: (2026)

Energy-efficient Decentralized Learning via Graph Sparsification
by: Zhang, Xusheng, et al.
Published: (2024)

AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
by: Liu, Xiaogeng, et al.
Published: (2024)

HabitatAgent: An End-to-End Multi-Agent System for Housing Consultation
by: Yang, Hongyang, et al.
Published: (2026)

Mobile GUI Agents under Real-world Threats: Are We There Yet?
by: Liu, Guohong, et al.
Published: (2025)

Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation
by: Huang, Tianyu, et al.
Published: (2024)

FORTIS: Benchmarking Over-Privilege in Agent Skills
by: Li, Shawn, et al.
Published: (2026)

GrowthHacker: Automated Off-Policy Evaluation Optimization Using Code-Modifying LLM Agents
by: Wu, Jie JW, et al.
Published: (2025)

ProjDevBench: Benchmarking AI Coding Agents on End-to-End Project Development
by: Lu, Pengrui, et al.
Published: (2026)

Prototype2Code: End-to-end Front-end Code Generation from UI Design Prototypes
by: Xiao, Shuhong, et al.
Published: (2024)

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents
by: Chen, Zhaorun, et al.
Published: (2026)

Artificial Intelligence as the New Hacker: Developing Agents for Offensive Security
by: Valencia, Leroy Jacob
Published: (2024)

FunctionalAgent: Towards end-to-end on-top functional design
by: Chen, Yuhao, et al.
Published: (2026)

VISTA: An End-to-End Benchmark for Visual Spec-to-Web-App Coding Agents
by: Guo, JunJia, et al.
Published: (2026)

ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models
by: Liu, Xiaogeng, et al.
Published: (2026)

AI-Trader: Benchmarking Autonomous Agents in Real-Time Financial Markets
by: Fan, Tianyu, et al.
Published: (2025)

HSCO-Bench: An Agent-Driven End-to-End Hardware-Software Co-design Benchmark for Systems-on-Chip
by: Tsai, Pei-Huan, et al.
Published: (2026)