:: Library Catalog

Bibliographic Details
Main Author:	Gowda, Ishrith
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Artificial Intelligence; Machine Learning; D.4.6; I.2.6; I.2.7
Online Access:	https://arxiv.org/abs/2605.03482

Similar Items

Hidden in Memory: Sleeper Memory Poisoning in LLM Agents
by: Pulipaka, Sidharth, et al.
Published: (2026)

Binary-30K: A Heterogeneous Dataset for Deep Learning in Binary Analysis and Malware Detection
by: Bommarito II, Michael J.
Published: (2025)

Attacking interpretable NLP systems
by: Abdukhamidov, Eldor, et al.
Published: (2025)

AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) Framework
by: Nathanson, Samuel, et al.
Published: (2025)

Predicting Known Vulnerabilities from Attack Descriptions Using Sentence Transformers
by: Othman, Refat
Published: (2026)

Refusal Evaluation in Coding LLMs and Code Agents: A Systematic Review of Thirteen Malicious-Code Prompt Corpora (2023-2025)
by: Young, Richard J., et al.
Published: (2026)

Detecting Prompt Injection Attacks Against Application Using Classifiers
by: Shaheer, Safwan, et al.
Published: (2025)

Beyond the Benchmark: Innovative Defenses Against Prompt Injection Attacks
by: Shaheer, Safwan, et al.
Published: (2025)

Governance Architecture for Autonomous Agent Systems: Threats, Framework, and Engineering Practice
by: Ge, Yuxu
Published: (2026)

AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification
by: Zhang, Tian, et al.
Published: (2026)

Multilingual AI-Driven Password Strength Estimation with Similarity-Based Detection
by: Palaniappan, Nikitha M., et al.
Published: (2026)

AegisShield: Democratizing Cyber Threat Modeling with Generative AI
by: Grofsky, Matthew
Published: (2025)

Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study
by: Khatiwala, Jeel Piyushkumar, et al.
Published: (2026)

Towards Agentic Investigation of Security Alerts
by: Eilertsen, Even, et al.
Published: (2026)

Privately Fine-Tuned LLMs Preserve Temporal Dynamics in Tabular Data
by: Rosenblatt, Lucas, et al.
Published: (2026)

Enabling Transparent Cyber Threat Intelligence Combining Large Language Models and Domain Ontologies
by: Cotti, Luca, et al.
Published: (2025)

Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity
by: Williams-King, David, et al.
Published: (2025)

Quantifying Return on Security Controls in LLM Systems
by: Moulton, Richard Helder, et al.
Published: (2025)

Security Considerations for Multi-agent Systems
by: Nguyen, Tam, et al.
Published: (2026)

Beyond Static Sandboxing: Learned Capability Governance for Autonomous AI Agents
by: Sidik, Bronislav, et al.
Published: (2026)

Tatemae: Detecting Alignment Faking via Tool Selection in LLMs
by: Leonesi, Matteo, et al.
Published: (2026)

Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories
by: Bercovich, Ivan, et al.
Published: (2026)

AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models
by: Dawson, Ads, et al.
Published: (2025)

The Automation Advantage in AI Red Teaming
by: Mulla, Rob, et al.
Published: (2025)

SAND: A Self-supervised and Adaptive NAS-Driven Framework for Hardware Trojan Detection
by: Pan, Zhixin, et al.
Published: (2025)

Can AI Keep a Secret? Contextual Integrity Verification: A Provable Security Architecture for LLMs
by: Gupta, Aayush
Published: (2025)

Poison in the Well: Feature Embedding Disruption in Backdoor Attacks
by: Feng, Zhou, et al.
Published: (2025)

David vs. Goliath: Verifiable Agent-to-Agent Jailbreaking via Reinforcement Learning
by: Nellessen, Samuel, et al.
Published: (2026)

Evaluating Query Efficiency and Accuracy of Transfer Learning-based Model Extraction Attack in Federated Learning
by: Ahamed, Sayyed Farid, et al.
Published: (2025)

Autonomous Penetration Testing: Solving Capture-the-Flag Challenges with LLMs
by: Bakker, Isabelle, et al.
Published: (2025)

A Self-Improving Architecture for Dynamic Safety in Large Language Models
by: Slater, Tyler
Published: (2025)

Retrieval Augmented Classification for Confidential Documents
by: Chang, Yeseul E., et al.
Published: (2026)

SALLIE: Safeguarding Against Latent Language & Image Exploits
by: Azov, Guy, et al.
Published: (2026)

Unlearning at Scale: Implementing the Right to be Forgotten in Large Language Models
by: X, Abdullah
Published: (2025)

Continuous Discovery of Vulnerabilities in LLM Serving Systems with Fuzzing
by: Zhao, Yunze, et al.
Published: (2026)

BioRefusalAudit: Auditing Biosecurity Refusal Depth Using General and Domain-Fine-Tuned Sparse Autoencoders
by: DeLeeuw, Caleb
Published: (2026)

Provable Repair of Deep Neural Network Defects by Preimage Synthesis and Property Refinement
by: Ma, Jianan, et al.
Published: (2025)

RADEP: A Resilient Adaptive Defense Framework Against Model Extraction Attacks
by: Chakraborty, Amit, et al.
Published: (2025)

Digital Forgetting in Large Language Models: A Survey of Unlearning Methods
by: Blanco-Justicia, Alberto, et al.
Published: (2024)

PoTS: Proof-of-Training-Steps for Backdoor Detection in Large Language Models
by: Seddik, Issam, et al.
Published: (2025)