Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Sriram, Vidyut; Pandita, Sawan; Lakshmanan, Achintya; Shamraj, Aneesh; Saha, Suman
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Machine Learning
Online Access:	https://arxiv.org/abs/2601.00509

_version_	1866912800471777280
author	Sriram, Vidyut; Pandita, Sawan; Lakshmanan, Achintya; Shamraj, Aneesh; Saha, Suman
author_facet	Sriram, Vidyut; Pandita, Sawan; Lakshmanan, Achintya; Shamraj, Aneesh; Saha, Suman
contents	Large Language Models (LLMs) can generate code but often introduce security vulnerabilities, logical inconsistencies, and compilation errors. Prior work demonstrates that LLMs benefit substantially from structured feedback, static analysis, retrieval augmentation, and execution-based refinement. We propose a retrieval-augmented, multi-tool repair workflow in which a single code-generating LLM iteratively refines its outputs using compiler diagnostics, CodeQL security scanning, and KLEE symbolic execution. A lightweight embedding model is used for semantic retrieval of previously successful repairs, providing security-focused examples that guide generation. Evaluated on a combined dataset of 3,242 programs generated by DeepSeek-Coder-1.3B and CodeLlama-7B, the system demonstrates significant improvements in robustness. For DeepSeek, security vulnerabilities were reduced by 96%. For the larger CodeLlama model, the critical security defect rate was decreased from 58.55% to 22.19%, highlighting the efficacy of tool-assisted self-repair even on "stubborn" models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_00509
institution	arXiv
publishDate	2026
record_format	arxiv
title	Improving LLM-Assisted Secure Code Generation through Retrieval-Augmented-Generation and Multi-Tool Feedback
topic	Cryptography and Security; Machine Learning
url	https://arxiv.org/abs/2601.00509