| Main Authors: | Lee, Taegyeong, Yoo, Jeonghwa, Cho, Hyoungseo, Kim, Soo Yong, Maeng, Yunho |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.12299 |
Similar Items
AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration
by: Chen, Jizhou, et al.
Published: (2025)
CrypTorch: PyTorch-based Auto-tuning Compiler for Machine Learning with Multi-party Computation
by: Liu, Jinyu, et al.
Published: (2025)
AdaptiveGuard: Towards Adaptive Runtime Safety for LLM-Powered Software
by: Yang, Rui, et al.
Published: (2025)
MultiPhishGuard: An Explainable and Adaptive Multi-Agent LLM System for Phishing Email Detection
by: Xue, Yinuo, et al.
Published: (2025)
GuardReasoner: Towards Reasoning-based LLM Safeguards
by: Liu, Yue, et al.
Published: (2025)
Benchmarking Large Language Models for Zero-shot and Few-shot Phishing URL Detection
by: Hasan, Najmul, et al.
Published: (2026)
When Safety Geometry Collapses: Fine-Tuning Vulnerabilities in Agentic Guard Models
by: Hossain, Ismail, et al.
Published: (2026)
X-Guard: Multilingual Guard Agent for Content Moderation
by: Upadhayay, Bibek, et al.
Published: (2025)
Privacy Guard & Token Parsimony by Prompt and Context Handling and LLM Routing
by: Langiu, Alessio
Published: (2026)
RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents
by: Xiao, Wenjie, et al.
Published: (2026)
SafeMLRM: Demystifying Safety in Multi-modal Large Reasoning Models
by: Fang, Junfeng, et al.
Published: (2025)
Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts
by: Hasan, Md. Mehedi, et al.
Published: (2025)
A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory
by: Wei, Qianshan, et al.
Published: (2025)
GuardReasoner-Omni: A Reasoning-based Multi-modal Guardrail for Text, Image, Video, and Audio
by: Zhu, Zhenhao, et al.
Published: (2026)
VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation
by: Miculicich, Lesly, et al.
Published: (2025)
Reflect-Guard: Enhancing LLM Safeguards against Adversarial Prompts via Logical Self-Reflection
by: Lin, Lixing, et al.
Published: (2026)
DistillGuard: Evaluating Defenses Against LLM Knowledge Distillation
by: Jiang, Bo
Published: (2026)
TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol
by: Nguyen, Khanh Linh, et al.
Published: (2026)
GLiNER Guard: Unified Encoder Family for Production LLM Safety and Privacy
by: Minko, Bogdan, et al.
Published: (2026)
ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection
by: Zhao, Wei, et al.
Published: (2026)
CoopGuard: Stateful Cooperative Agents Safeguarding LLMs Against Evolving Multi-Round Attacks
by: Li, Siyuan, et al.
Published: (2026)
Seven Security Challenges That Must be Solved in Cross-domain Multi-agent LLM Systems
by: Ko, Ronny, et al.
Published: (2025)
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
by: Ghosal, Soumya Suvra, et al.
Published: (2024)
CoT-Guard: Small Models for Strong Monitoring
by: Diwan, Nirav, et al.
Published: (2026)
LLM-Safety Evaluations Lack Robustness
by: Beyer, Tim, et al.
Published: (2025)
3D Guard-Layer: An Integrated Agentic AI Safety System for Edge Artificial Intelligence
by: Kurshan, Eren, et al.
Published: (2025)
SGuard-v1: Safety Guardrail for Large Language Models
by: Lee, JoonHo, et al.
Published: (2025)
Poly-Guard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset
by: Kang, Mintong, et al.
Published: (2025)
ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities for Cyberdefense
by: Lau, Nancy, et al.
Published: (2026)
PropGuard: Safeguarding LLM-MAS via Propagation-Aware Exploration and Remediation
by: Yan, Bingyu, et al.
Published: (2026)
AIRGuard: Guarding Agent Actions with Runtime Authority Control
by: Qin, Suliu, et al.
Published: (2026)
aiXamine: Simplified LLM Safety and Security
by: Deniz, Fatih, et al.
Published: (2025)
MCP-Guard: A Multi-Stage Defense-in-Depth Framework for Securing Model Context Protocol in Agentic AI
by: Xing, Wenpeng, et al.
Published: (2025)
AEGIS : Automated Co-Evolutionary Framework for Guarding Prompt Injections Schema
by: Liu, Ting-Chun, et al.
Published: (2025)
AI Kill Switch for malicious web-based LLM agent
by: Lee, Sechan, et al.
Published: (2025)
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
by: Liu, Yue, et al.
Published: (2025)
CourtGuard: A Local, Multiagent Prompt Injection Classifier
by: Wu, Isaac, et al.
Published: (2025)
SAGE: A Generic Framework for LLM Safety Evaluation
by: Jindal, Madhur, et al.
Published: (2025)
PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks
by: Shen, Guobin, et al.
Published: (2025)
Parallel Test-Time Scaling with Multi-Sequence Verifiers
by: Kim, Yegon, et al.
Published: (2026)