:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Wenhui; Xu, Huiyu; Wang, Zhibo; Li, Zhichao; He, Zeqing; Wei, Xuelin; Ren, Kui
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security
Online Access:	https://arxiv.org/abs/2601.21380

Similar Items

Can Small Language Models Reliably Resist Jailbreak Attacks? A Comprehensive Evaluation
by: Zhang, Wenhui, et al.
Published: (2025)

Interpretable LLM Guardrails via Sparse Representation Steering
by: He, Zeqing, et al.
Published: (2025)

JailbreakLens: Interpreting Jailbreak Mechanism in the Lens of Representation and Circuit
by: He, Zeqing, et al.
Published: (2024)

LoopTrap: Termination Poisoning Attacks on LLM Agents
by: Xu, Huiyu, et al.
Published: (2026)

PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
by: Wang, Yaopeng, et al.
Published: (2025)

Dynamic Dual-level Defense Routing for Continual Adversarial Training
by: Wang, Wenxuan, et al.
Published: (2025)

Rerouting LLM Routers
by: Shafran, Avital, et al.
Published: (2025)

RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent
by: Xu, Huiyu, et al.
Published: (2024)

LoRA-Key: User-Centric LoRA Watermarking for Text-to-Image Diffusion Models
by: Wang, Yaopeng, et al.
Published: (2026)

Safeguarding LLM Embeddings in End-Cloud Collaboration via Entropy-Driven Perturbation
by: Jin, Shuaifan, et al.
Published: (2025)

CrossGuard: Safeguarding MLLMs against Joint-Modal Implicit Malicious Attacks
by: Zhang, Xu, et al.
Published: (2025)

The Communication-Friendly Privacy-Preserving Machine Learning against Malicious Adversaries
by: Lu, Tianpei, et al.
Published: (2024)

Reflect-Guard: Enhancing LLM Safeguards against Adversarial Prompts via Logical Self-Reflection
by: Lin, Lixing, et al.
Published: (2026)

Privacy Guard & Token Parsimony by Prompt and Context Handling and LLM Routing
by: Langiu, Alessio
Published: (2026)

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents
by: Xiao, Wenjie, et al.
Published: (2026)

CircuitGuard: Mitigating LLM Memorization in RTL Code Generation Against IP Leakage
by: Mashnoor, Nowfel, et al.
Published: (2025)

SWAT: A System-Wide Approach to Tunable Leakage Mitigation in Encrypted Data Stores
by: Zheng, Leqian, et al.
Published: (2023)

CipherGuard: Compiler-aided Mitigation against Ciphertext Side-channel Attacks
by: Jiang, Ke, et al.
Published: (2025)

Explainer-guided Targeted Adversarial Attacks against Binary Code Similarity Detection Models
by: Chen, Mingjie, et al.
Published: (2025)

Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts
by: Hasan, Md. Mehedi, et al.
Published: (2025)

JailGuard: A Universal Detection Framework for LLM Prompt-based Attacks
by: Zhang, Xiaoyu, et al.
Published: (2023)

"Training robust watermarking model may hurt authentication!" Exploring and Mitigating the Identity Leakage in Robust Watermarking
by: Zhang, Xinyu, et al.
Published: (2026)

MindGuard: Intrinsic Decision Inspection for Securing LLM Agents Against Metadata Poisoning
by: Wang, Zhiqiang, et al.
Published: (2025)

RTD-Guard: A Black-Box Textual Adversarial Detection Framework via Replacement Token Detection
by: Zhu, He, et al.
Published: (2026)

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion
by: Li, Jiaqing, et al.
Published: (2026)

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations
by: He, Yu, et al.
Published: (2026)

LLM Security Guard for Code
by: Kavian, Arya, et al.
Published: (2024)

PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks
by: Shen, Guobin, et al.
Published: (2025)

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents
by: Chen, Yulin, et al.
Published: (2026)

RouteScan: A Non-Intrusive Approach to Auditing MoE LLMs Safety via Expert Routing Telemetry
by: Lv, Bo, et al.
Published: (2026)

SAGE: Sample-Aware Guarding Engine for Robust Intrusion Detection Against Adversarial Attacks
by: Chen, Jing, et al.
Published: (2025)

GuardFS: a File System for Integrated Detection and Mitigation of Linux-based Ransomware
by: von der Assen, Jan, et al.
Published: (2024)

A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory
by: Wei, Qianshan, et al.
Published: (2025)

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection
by: Zhao, Wei, et al.
Published: (2026)

Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning
by: Wang, Zhibo, et al.
Published: (2024)

Enhancing Adversarial Attacks via Parameter Adaptive Adversarial Attack
by: Jin, Zhibo, et al.
Published: (2024)

DMS: Addressing Information Loss with More Steps for Pragmatic Adversarial Attacks
by: Zhu, Zhiyu, et al.
Published: (2024)

ExplainableGuard: Interpretable Adversarial Defense for Large Language Models Using Chain-of-Thought Reasoning
by: Guan, Shaowei, et al.
Published: (2025)

Grimlock: Guarding High-Agency Systems with eBPF and Attested Channels
by: Wu, Qiancheng, et al.
Published: (2026)

Adversarial Threat Vectors and Risk Mitigation for Retrieval-Augmented Generation Systems
by: Ward, Chris M., et al.
Published: (2025)