| Main Author: | Barnes, Jarrod |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.21083 |
Similar Items
Surprisal-Guided Selection: Compute-Optimal Test-Time Strategies for Execution-Grounded Code Generation
by: Barnes, Jarrod
Published: (2026)
Continual Learning, Not Training: Online Adaptation For Agents
by: Jaglan, Aman, et al.
Published: (2025)
AIR: Improving Agent Safety through Incident Response
by: Xiao, Zibo, et al.
Published: (2026)
Anticipating Adversary Behavior in DevSecOps Scenarios through Large Language Models
by: Caballero, Mario Marín, et al.
Published: (2026)
SecMate: Multi-Agent Adaptive Cybersecurity Troubleshooting with Tri-Context Personalization
by: Meidan, Yair, et al.
Published: (2026)
SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents
by: Begimher, Daniel, et al.
Published: (2026)
TheraAgent: Multi-Agent Framework with Self-Evolving Memory and Evidence-Calibrated Reasoning for PET Theranostics
by: Chen, Zhihao, et al.
Published: (2026)
Incident Analysis for AI Agents
by: Ezell, Carson, et al.
Published: (2025)
SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories
by: Shen, Chihao, et al.
Published: (2025)
Anomaly Detection for Incident Response at Scale
by: Wang, Hanzhang, et al.
Published: (2024)
Measuring Responsibility in Multi-Agent Systems
by: Mu, Chunyan, et al.
Published: (2024)
EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law
by: Lichkovski, Ilija, et al.
Published: (2025)
In-Context Autonomous Network Incident Response: An End-to-End Large Language Model Agent Approach
by: Gao, Yiran, et al.
Published: (2026)
Cloud-based XAI Services for Assessing Open Repository Models Under Adversarial Attacks
by: Wang, Zerui, et al.
Published: (2024)
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
by: Becker, Joel, et al.
Published: (2025)
Rethinking CyberSecEval: An LLM-Aided Approach to Evaluation Critique
by: Hariharan, Suhas, et al.
Published: (2024)
OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability
by: Ullrich, Karen, et al.
Published: (2025)
ShIOEnv: A Command Evaluation Environment for Grammar-Constrained Synthesis and Execution Behavior Modeling
by: Ragsdale, Jarrod, et al.
Published: (2025)
AI Loss of Control Incident Management: Response & Resilience
by: Gruetzemacher, Ross
Published: (2026)
AI for DevSecOps: A Landscape and Future Opportunities
by: Fu, Michael, et al.
Published: (2024)
SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing
by: Meguimtsop, Almene De Meran, et al.
Published: (2026)
Automated Traffic Incident Response Plans using Generative Artificial Intelligence: Part 1 -- Building the Incident Response Benchmark
by: Grigorev, Artur, et al.
Published: (2025)
OpsAgent: An Evolving Multi-agent System for Incident Management in Microservices
by: Luo, Yu, et al.
Published: (2025)
SecCodeBench-V2 Technical Report
by: Chen, Longfei, et al.
Published: (2026)
The potential of LLM-generated reports in DevSecOps
by: Lykousas, Nikolaos, et al.
Published: (2024)
Relevant Is Not Warranted: Evidence-Force Calibration for Cited RAG
by: Qian, Pin, et al.
Published: (2026)
SecInfer: Preventing Prompt Injection via Inference-time Scaling
by: Liu, Yupei, et al.
Published: (2025)
TIPS: Threat Actor Informed Prioritization of Applications using SecEncoder
by: Bulut, Muhammed Fatih, et al.
Published: (2024)
Agents of Change: Self-Evolving LLM Agents for Strategic Planning
by: Belle, Nikolas, et al.
Published: (2025)
Chances and Challenges of the Model Context Protocol in Digital Forensics and Incident Response
by: Hilgert, Jan-Niclas, et al.
Published: (2025)
HardSecBench: Benchmarking the Security Awareness of LLMs for Hardware Code Generation
by: Chen, Qirui, et al.
Published: (2026)
SecPE: Secure Prompt Ensembling for Private and Robust Large Language Models
by: Zhang, Jiawen, et al.
Published: (2025)
SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity
by: Jing, Pengfei, et al.
Published: (2024)
A Hybrid Model for Traffic Incident Detection based on Generative Adversarial Networks and Transformer Model
by: Lu, Xinying, et al.
Published: (2024)
SecEncoder: Logs are All You Need in Security
by: Bulut, Muhammed Fatih, et al.
Published: (2024)
SecPI: Secure Code Generation with Reasoning Models via Security Reasoning Internalization
by: Wang, Hao, et al.
Published: (2026)
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks
by: Chen, Sizhe, et al.
Published: (2025)
Robust Calibration For Improved Weather Prediction Under Distributional Shift
by: Gilda, Sankalp, et al.
Published: (2024)
ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration
by: Chen, Yifei, et al.
Published: (2026)
Development and Evaluation of an Ontology for Non-Invasive Respiratory Support in Acute Care
by: Islam, Md Fantacher, et al.
Published: (2025)