| Main authors: | Bakker, Isabelle; Hastings, John |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2508.01054 |
Similar items
Predicting Known Vulnerabilities from Attack Descriptions Using Sentence Transformers
by: Othman, Refat
Published: (2026)
AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models
by: Dawson, Ads, et al.
Published: (2025)
The Automation Advantage in AI Red Teaming
by: Mulla, Rob, et al.
Published: (2025)
Governance Architecture for Autonomous Agent Systems: Threats, Framework, and Engineering Practice
by: Ge, Yuxu
Published: (2026)
Refusal Evaluation in Coding LLMs and Code Agents: A Systematic Review of Thirteen Malicious-Code Prompt Corpora (2023-2025)
by: Young, Richard J., et al.
Published: (2026)
Privately Fine-Tuned LLMs Preserve Temporal Dynamics in Tabular Data
by: Rosenblatt, Lucas, et al.
Published: (2026)
AegisShield: Democratizing Cyber Threat Modeling with Generative AI
by: Grofsky, Matthew
Published: (2025)
Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study
by: Khatiwala, Jeel Piyushkumar, et al.
Published: (2026)
Multilingual AI-Driven Password Strength Estimation with Similarity-Based Detection
by: Palaniappan, Nikitha M., et al.
Published: (2026)
Towards Agentic Investigation of Security Alerts
by: Eilertsen, Even, et al.
Published: (2026)
AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification
by: Zhang, Tian, et al.
Published: (2026)
Safeguarding Virtual Healthcare: A Novel Attacker-Centric Model for Data Security and Privacy
by: Herath, Suvineetha, et al.
Published: (2024)
Quantifying Return on Security Controls in LLM Systems
by: Moulton, Richard Helder, et al.
Published: (2025)
AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) Framework
by: Nathanson, Samuel, et al.
Published: (2025)
Security Considerations for Multi-agent Systems
by: Nguyen, Tam, et al.
Published: (2026)
Toward Secure and Compliant AI: Organizational Standards and Protocols for NLP Model Lifecycle Management
by: Arora, Sunil, et al.
Published: (2025)
Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Requests
by: Young, Richard J., et al.
Published: (2026)
SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots
by: Adebimpe, Adetayo, et al.
Published: (2025)
Securing Agentic AI Systems -- A Multilayer Security Framework
by: Arora, Sunil, et al.
Published: (2025)
Binary-30K: A Heterogeneous Dataset for Deep Learning in Binary Analysis and Malware Detection
by: Bommarito II, Michael J.
Published: (2025)
Detecting Prompt Injection Attacks Against Application Using Classifiers
by: Shaheer, Safwan, et al.
Published: (2025)
Beyond the Benchmark: Innovative Defenses Against Prompt Injection Attacks
by: Shaheer, Safwan, et al.
Published: (2025)
Can AI Lower the Barrier to Cybersecurity? A Human-Centered Mixed-Methods Study of Novice CTF Learning
by: Schachner, Cathrin, et al.
Published: (2026)
Hidden in Memory: Sleeper Memory Poisoning in LLM Agents
by: Pulipaka, Sidharth, et al.
Published: (2026)
A Secure, Manifest-Based Framework for Delegated Privilege Promotion
by: Chowdhury, Rajarshi, et al.
Published: (2026)
MEMSAD: Gradient-Coupled Anomaly Detection for Memory Poisoning in Retrieval-Augmented Agents
by: Gowda, Ishrith
Published: (2026)
Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity
by: Williams-King, David, et al.
Published: (2025)
Attacking interpretable NLP systems
by: Abdukhamidov, Eldor, et al.
Published: (2025)
Beyond Static Sandboxing: Learned Capability Governance for Autonomous AI Agents
by: Sidik, Bronislav, et al.
Published: (2026)
Can AI Keep a Secret? Contextual Integrity Verification: A Provable Security Architecture for LLMs
by: Gupta, Aayush
Published: (2025)
CEKER: A Generalizable LLM Framework for Literature Analysis with a Case Study in Unikernel Security
by: Wollman, Alex, et al.
Published: (2024)
Secure and Scalable Blockchain Voting: A Comparative Framework and the Role of Large Language Models
by: Kiashemshaki, Kiana, et al.
Published: (2025)
Tatemae: Detecting Alignment Faking via Tool Selection in LLMs
by: Leonesi, Matteo, et al.
Published: (2026)
Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories
by: Bercovich, Ivan, et al.
Published: (2026)
Confronting the Reproducibility Crisis: A Case Study of Challenges in Cybersecurity AI
by: Moulton, Richard H., et al.
Published: (2024)
DP-2Stage: Adapting Language Models as Differentially Private Tabular Data Generators
by: Afonja, Tejumade, et al.
Published: (2024)
RADEP: A Resilient Adaptive Defense Framework Against Model Extraction Attacks
by: Chakraborty, Amit, et al.
Published: (2025)
Continuous Discovery of Vulnerabilities in LLM Serving Systems with Fuzzing
by: Zhao, Yunze, et al.
Published: (2026)
CTF as a Service: A reproducible and scalable infrastructure for cybersecurity training
by: Miguel, Carlos Jimeno, et al.
Published: (2026)
Static Attribution of Android Residential Proxy Malware Using Graph Kernels
by: Clark, Peter, et al.
Published: (2026)