| Main Authors: | Lip, Edward Lue Chee, Channg, Anthony, Kim, Diana, Sandoval, Aaron, Zhu, Kevin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.14745 |
Similar Items
Factor(T,U): Factored Cognition Strengthens Monitoring of Untrusted AI
by: Sandoval, Aaron, et al.
Published: (2025)
Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills
by: Lv, Lijia, et al.
Published: (2026)
PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts
by: Li, Qinfeng, et al.
Published: (2026)
Secure Multiparty Generative AI
by: Shrestha, Manil, et al.
Published: (2024)
NetMoniAI: An Agentic AI Framework for Network Security & Monitoring
by: Zambare, Pallavi, et al.
Published: (2025)
Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols
by: Terekhov, Mikhail, et al.
Published: (2025)
LLM Agent Honeypot: Monitoring AI Hacking Agents in the Wild
by: Reworr, et al.
Published: (2024)
Securing Agentic AI: Threat Modeling and Risk Analysis for Network Monitoring Agentic AI System
by: Zambare, Pallavi, et al.
Published: (2025)
SmartLLM: Smart Contract Auditing using Custom Generative AI
by: Kevin, Jun, et al.
Published: (2025)
SECUREVENT: Hybrid AI/ML Security Monitoring for Distributed Event-Based Systems
by: Liang, Eric
Published: (2026)
Hallucination-Resistant Security Planning with a Large Language Model
by: Hammar, Kim, et al.
Published: (2026)
Preventing Adversarial AI Attacks Against Autonomous Situational Awareness: A Maritime Case Study
by: Walter, Mathew J., et al.
Published: (2025)
TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol
by: Nguyen, Khanh Linh, et al.
Published: (2026)
Range-Arithmetic: Verifiable Deep Learning Inference on an Untrusted Party
by: Rahimi, Ali, et al.
Published: (2025)
Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors
by: Yueh-Han, Chen, et al.
Published: (2025)
Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements
by: Isozaki, Isamu, et al.
Published: (2024)
ATLANTIS: AI-driven Threat Localization, Analysis, and Triage Intelligence System
by: Kim, Taesoo, et al.
Published: (2025)
SoK: How Robust is Audio Watermarking in Generative AI models?
by: Wen, Yizhu, et al.
Published: (2025)
Securing AI Agents with Information-Flow Control
by: Costa, Manuel, et al.
Published: (2025)
Progent: Securing AI Agents with Privilege Control
by: Shi, Tianneng, et al.
Published: (2025)
MonitoringBench: Semi-Automated Red-Teaming for Agent Monitoring
by: Jotautaitė, Monika, et al.
Published: (2026)
Human-AI Collaborative Bot Detection in MMORPGs
by: Son, Jaeman, et al.
Published: (2025)
Reliable Weak-to-Strong Monitoring of LLM Agents
by: Kale, Neil, et al.
Published: (2025)
Incident Response Planning Using a Lightweight Large Language Model with Reduced Hallucination
by: Hammar, Kim, et al.
Published: (2025)
Enigma: Application-Layer Privacy for Quantum Optimization on Untrusted Computers
by: Ayanzadeh, Ramin, et al.
Published: (2023)
Enterprise AI Must Enforce Participant-Aware Access Control
by: Bhatt, Shashank Shreedhar, et al.
Published: (2025)
Access Controlled Website Interaction for Agentic AI with Delegated Critical Tasks
by: Kim, Sunyoung, et al.
Published: (2026)
Towards AI-Driven Human-Machine Co-Teaming for Adaptive and Agile Cyber Security Operation Centers
by: Albanese, Massimiliano, et al.
Published: (2025)
Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs
by: Yang, Ya-Ting, et al.
Published: (2026)
BashArena: A Control Setting for Highly Privileged AI Agents
by: Kaufman, Adam, et al.
Published: (2025)
Bypassing AI Control Protocols via Agent-as-a-Proxy Attacks
by: Isbarov, Jafar, et al.
Published: (2026)
Preventing Non-intrusive Load Monitoring Privacy Invasion: A Precise Adversarial Attack Scheme for Networked Smart Meters
by: He, Jialing, et al.
Published: (2024)
Agentic AI for Cyber Resilience: A New Security Paradigm and Its System-Theoretic Foundations
by: Li, Tao, et al.
Published: (2025)
Opal: Private Memory for Personal AI
by: Kaviani, Darya, et al.
Published: (2026)
Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks
by: Xiang, Chong, et al.
Published: (2026)
Discovering Command and Control (C2) Channels on Tor and Public Networks Using Reinforcement Learning
by: Wang, Cheng, et al.
Published: (2024)
Automatic Jailbreaking of the Text-to-Image Generative AI Systems
by: Kim, Minseon, et al.
Published: (2024)
CoT-Guard: Small Models for Strong Monitoring
by: Diwan, Nirav, et al.
Published: (2026)
OML: A Primitive for Reconciling Open Access with Owner Control in AI Model Distribution
by: Cheng, Zerui, et al.
Published: (2024)
AutoControl Arena: Synthesizing Executable Test Environments for Frontier AI Risk Evaluation
by: Li, Changyi, et al.
Published: (2026)