| Main Authors: | Huang, Yifan, Jia, Xiaojun, Guo, Wenbo, Sun, Yuqiang, Huang, Yihao, Wang, Chong, Liu, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.21236 |
Similar Items
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
by: Sun, Yuqiang, et al.
Published: (2024)
Do Fine-Tuned LLMs Understand Vulnerabilities? An Investigation into the Semantic Trap
by: Huang, Feiyang, et al.
Published: (2026)
Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing
by: Peng, Jiaren, et al.
Published: (2026)
GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis
by: Sun, Yuqiang, et al.
Published: (2023)
Bridging Expert Reasoning and LLM Detection: A Knowledge-Driven Framework for Malicious Packages
by: Guo, Wenbo, et al.
Published: (2026)
Co-PatcheR: Collaborative Software Patching with Component(s)-specific Small Reasoning Models
by: Tang, Yuheng, et al.
Published: (2025)
LLM-enabled Applications Require System-Level Threat Monitoring
by: Zhang, Yedi, et al.
Published: (2026)
DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation
by: Huang, Li, et al.
Published: (2026)
Taint-Style Vulnerability Detection and Confirmation for Node.js Packages Using LLM Agent Reasoning
by: Ni, Ronghao, et al.
Published: (2026)
Learning to Generate Secure Code via Token-Level Rewards
by: Quan, Jiazheng, et al.
Published: (2026)
Detecting Privilege Escalation in Polyglot Microservices via Agentic Program Analysis
by: Li, Penghui, et al.
Published: (2026)
RefleXGen: The unexamined code is not worth using
by: Wang, Bin, et al.
Published: (2025)
Knowdit: Agentic Smart Contract Vulnerability Detection with Auditing Knowledge Summarization
by: Kong, Ziqiao, et al.
Published: (2026)
Fortifying LLM-Based Code Generation with Graph-Based Reasoning on Secure Coding Practices
by: Patir, Rupam, et al.
Published: (2025)
DevOps-Gym: Benchmarking AI Agents in Software DevOps Cycle
by: Tang, Yuheng, et al.
Published: (2026)
BandFuzz: An ML-powered Collaborative Fuzzing Framework
by: Shi, Wenxuan, et al.
Published: (2025)
OpenSage: Self-programming Agent Generation Engine
by: Li, Hongwei, et al.
Published: (2026)
Cutting the Gordian Knot: Detecting Malicious PyPI Packages via a Knowledge-Mining Framework
by: Guo, Wenbo, et al.
Published: (2026)
RACC: Representation-Aware Coverage Criteria for LLM Safety Testing
by: Wei, Zeming, et al.
Published: (2026)
Guiding Symbolic Execution with Static Analysis and LLMs for Vulnerability Discovery
by: Shafiuzzaman, Md, et al.
Published: (2026)
AdaptiveGuard: Towards Adaptive Runtime Safety for LLM-Powered Software
by: Yang, Rui, et al.
Published: (2025)
Killing Two Birds with One Stone: Malicious Package Detection in NPM and PyPI using a Single Model of Malicious Behavior Sequence
by: Zhang, Junan, et al.
Published: (2023)
Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks
by: Chu, Junjie, et al.
Published: (2026)
ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection
by: Weng, Shihao, et al.
Published: (2026)
PentestEval: Benchmarking LLM-based Penetration Testing with Modular and Stage-Level Design
by: Yang, Ruozhao, et al.
Published: (2025)
Testing Storage-System Correctness: Challenges, Fuzzing Limitations, and AI-Augmented Opportunities
by: Wang, Ying, et al.
Published: (2026)
QLPro: Automated Code Vulnerability Discovery via LLM and Static Code Analysis Integration
by: Hu, Junze, et al.
Published: (2025)
LLM-Assisted Model-Based Fuzzing of Protocol Implementations
by: Huang, Changze, et al.
Published: (2025)
Evaluating Large Language Models for Line-Level Vulnerability Localization
by: Zhang, Jian, et al.
Published: (2024)
MILE: A Mutation Testing Framework of In-Context Learning Systems
by: Wei, Zeming, et al.
Published: (2024)
Implicit Patterns in LLM-Based Binary Analysis
by: Li, Qiang, et al.
Published: (2026)
SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection
by: Yu, Lei, et al.
Published: (2025)
A Systematic Study of LLM-Based Architectures for Automated Patching
by: Xu, Qingxiao, et al.
Published: (2026)
Security of LLM-generated Code: A Comparative Analysis
by: Morkonda, Srivathsan G, et al.
Published: (2026)
The potential of LLM-generated reports in DevSecOps
by: Lykousas, Nikolaos, et al.
Published: (2024)
Unveiling the Landscape of LLM Deployment in the Wild: An Empirical Study
by: Hou, Xinyi, et al.
Published: (2025)
SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents
by: Begimher, Daniel, et al.
Published: (2026)
CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions
by: Shi, Jingwei, et al.
Published: (2026)
SCDBench: A Benchmark for LLM-Based Smart Contract Decompilers
by: Qin, Kaihua, et al.
Published: (2026)
SKILLS: Structured Knowledge Injection for LLM-Driven Telecommunications Operations
by: Brett, Ivo
Published: (2026)