| Main Authors: | Chen, Chiyu, Song, Xinhao, Chai, Yunkai, Yao, Yang, Zhao, Haodong, Li, Lijun, Li, Jie, Teng, Yan, Liu, Gongshen, Wang, Yingchun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.20333 |
Similar Items
Revisiting Backdoor Threat in Federated Instruction Tuning from a Signal Aggregation Perspective
by: Zhao, Haodong, et al.
Published: (2026)
Environmental Injection Attacks against GUI Agents in Realistic Dynamic Environments
by: Zhang, Yitong, et al.
Published: (2025)
UOR: Universal Backdoor Attacks on Pre-trained Language Models
by: Du, Wei, et al.
Published: (2023)
Patronus: Identifying and Mitigating Transferable Backdoors in Pre-trained Language Models
by: Zhao, Tianhang, et al.
Published: (2025)
A Universal Identity Backdoor Attack against Speaker Verification based on Siamese Network
by: Zhao, Haodong, et al.
Published: (2023)
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
by: Cheng, Pengzhou, et al.
Published: (2023)
FedRS-Bench: Realistic Federated Learning Datasets and Benchmarks in Remote Sensing
by: Zhao, Haodong, et al.
Published: (2025)
SynGhost: Invisible and Universal Task-agnostic Backdoor Attack via Syntactic Transfer
by: Cheng, Pengzhou, et al.
Published: (2024)
VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents
by: Cao, Tri, et al.
Published: (2025)
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
by: Debenedetti, Edoardo, et al.
Published: (2024)
LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents
by: Yan, Zihe, et al.
Published: (2025)
WAInjectBench: Benchmarking Prompt Injection Detections for Web Agents
by: Liu, Yinuo, et al.
Published: (2025)
Ghost in the Agent: Redefining Information Flow Tracking for LLM Agents
by: Cai, Yuandao, et al.
Published: (2026)
MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content
by: Guo, Ruoqi, et al.
Published: (2026)
Physical and Software Based Fault Injection Attacks Against TEEs in Mobile Devices: A Systemisation of Knowledge
by: Joy, Aaron, et al.
Published: (2024)
NSmark: Null Space Based Black-box Watermarking Defense Framework for Language Models
by: Zhao, Haodong, et al.
Published: (2024)
Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
by: Chen, Yunhao, et al.
Published: (2025)
EmbTracker: Traceable Black-box Watermarking for Federated Language Models
by: Zhao, Haodong, et al.
Published: (2026)
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage
by: Liao, Zeyi, et al.
Published: (2024)
Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack
by: Wang, Hao, et al.
Published: (2026)
LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments
by: Zhang, Chiyu, et al.
Published: (2026)
OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs
by: Wang, Xin, et al.
Published: (2026)
Who Grants the Agent Power? Defending Against Instruction Injection via Task-Centric Access Control
by: Cai, Yifeng, et al.
Published: (2025)
SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents
by: Feng, Xinshun, et al.
Published: (2026)
MKF-ADS: Multi-Knowledge Fusion Based Self-supervised Anomaly Detection System for Control Area Network
by: Cheng, Pengzhou, et al.
Published: (2024)
ProtegoFed: Backdoor-Free Federated Instruction Tuning with Interspersed Poisoned Data
by: Zhao, Haodong, et al.
Published: (2026)
AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents
by: Zhang, Yixiang, et al.
Published: (2026)
Probing the Robustness of Large Language Models Safety to Latent Perturbations
by: Gu, Tianle, et al.
Published: (2025)
AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations
by: He, Yu, et al.
Published: (2026)
Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection
by: Miao, Ziqi, et al.
Published: (2025)
StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data
by: Wang, Yixu, et al.
Published: (2025)
HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?
by: Jiang, Yukun, et al.
Published: (2026)
Hunting the Ghost: Towards Automatic Mining of IoT Hidden Services
by: Dong, Shuaike, et al.
Published: (2025)
Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures
by: Wang, Suqing, et al.
Published: (2025)
Multimodal Instruction Disassembly with Covariate Shift Adaptation and Real-time Implementation
by: Bai, Yunkai, et al.
Published: (2024)
WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents
by: Chen, Yulin, et al.
Published: (2026)
A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos
by: Yao, Yang, et al.
Published: (2025)
Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems
by: Zhu, Pengyu, et al.
Published: (2025)
The Landscape of Prompt Injection Threats in LLM Agents: From Taxonomy to Analysis
by: Wang, Peiran, et al.
Published: (2026)
AgentTypo: Adaptive Typographic Prompt Injection Attacks against Black-box Multimodal Agents
by: Li, Yanjie, et al.
Published: (2025)