:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yihao; Wang, Kai; Wu, Jiangrong; Wu, Haolin; Zhou, Yuxuan; Wei, Zeming; Wu, Dongxian; Chen, Xun; Sun, Jun; Sun, Meng
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Artificial Intelligence; Computation and Language; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2604.11309

Similar Items

ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems
by: Zhang, Yihao, et al.
Published: (2026)

MILE: A Mutation Testing Framework of In-Context Learning Systems
by: Wei, Zeming, et al.
Published: (2024)

Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models
by: Zhang, Yihao, et al.
Published: (2024)

Secure LLM Fine-Tuning via Safety-Aware Probing
by: Wu, Chengcan, et al.
Published: (2025)

RACC: Representation-Aware Coverage Criteria for LLM Safety Testing
by: Wei, Zeming, et al.
Published: (2026)

Automata-Based Steering of Large Language Models for Diverse Structured Generation
by: Luan, Xiaokun, et al.
Published: (2025)

Boosting Jailbreak Attack with Momentum
by: Zhang, Yihao, et al.
Published: (2024)

When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models
by: Wang, Kai, et al.
Published: (2025)

Dynamic Orthogonal Continual Fine-tuning for Mitigating Catastrophic Forgettings
by: Zhang, Zhixin, et al.
Published: (2025)

ReGA: Model-Based Safeguard for LLMs via Representation-Guided Abstraction
by: Wei, Zeming, et al.
Published: (2025)

Control at Stake: Evaluating the Security Landscape of LLM-Driven Email Agents
by: Wu, Jiangrong, et al.
Published: (2025)

Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
by: Yang, Wenkai, et al.
Published: (2024)

ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation
by: Wu, Yiran, et al.
Published: (2025)

OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
by: Jia, Xiaojun, et al.
Published: (2025)

Conversations Risk Detection LLMs in Financial Agents via Multi-Stage Generative Rollout
by: Jiang, Xiaotong, et al.
Published: (2026)

Exploring the Robustness of In-Context Learning with Noisy Labels
by: Cheng, Chen, et al.
Published: (2024)

ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents
by: Wu, Jiangrong, et al.
Published: (2026)

Calibrated Adversarial Sampling: Multi-Armed Bandit-Guided Generalization Against Unforeseen Attacks
by: Wang, Rui, et al.
Published: (2025)

The Landscape of Prompt Injection Threats in LLM Agents: From Taxonomy to Analysis
by: Wang, Peiran, et al.
Published: (2026)

Security Attacks on LLM-based Code Completion Tools
by: Cheng, Wen, et al.
Published: (2024)

LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification
by: Tang, Fengxiao, et al.
Published: (2025)

RoMA: Robust Malware Attribution via Byte-level Adversarial Training with Global Perturbations and Adversarial Consistency Regularization
by: Sun, Yuxia, et al.
Published: (2025)

From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions
by: Peng, Jiaren, et al.
Published: (2025)

Exploit the Leak: Understanding Risks in Biometric Matchers
by: Durbet, Axel, et al.
Published: (2023)

Securing Multi-Agent Systems Against Corruptions via Node Contribution Backpropagation
by: Wu, Chengcan, et al.
Published: (2025)

The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections
by: Chen, Chaoran, et al.
Published: (2025)

Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking
by: Wu, Yu-Hang, et al.
Published: (2025)

Self-Disguise Attack: Induce the LLM to disguise itself for AIGT detection evasion
by: Zhou, Yinghan, et al.
Published: (2025)

HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities
by: Ren, Xiaoxue, et al.
Published: (2025)

Is the Digital Forensics and Incident Response Pipeline Ready for Text-Based Threats in LLM Era?
by: Bhandarkar, Avanti, et al.
Published: (2024)

Resource Consumption Threats in Large Language Models
by: Zhang, Yuanhe, et al.
Published: (2026)

Rubrics as an Attack Surface: Stealthy Preference Drift in LLM Judges
by: Ding, Ruomeng, et al.
Published: (2026)

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning
by: Ye, Xiaotian, et al.
Published: (2026)

MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs
by: Wen, Rui, et al.
Published: (2026)

BraveGuard: From Open-World Threats to Safer Computer-Use Agents
by: Feng, Yunhao, et al.
Published: (2026)

From Perception to Protection: A Developer-Centered Study of Security and Privacy Threats in Extended Reality (XR)
by: Cai, Kunlin, et al.
Published: (2025)

Can LLM Infer Risk Information From MCP Server System Logs?
by: Fu, Jiayi, et al.
Published: (2025)

RAPO: Risk-Aware Preference Optimization for Generalizable Safe Reasoning
by: Wei, Zeming, et al.
Published: (2026)

Generalized Security-Preserving Refinement for Concurrent Systems
by: Sun, Huan, et al.
Published: (2025)

Jatmo: Prompt Injection Defense by Task-Specific Finetuning
by: Piet, Julien, et al.
Published: (2023)