Staff View: Just-in-Time Detection of Silent Security Patches :: Library Catalog

Bibliographic Details
Main Authors:	Tang, Xunzhu; Chen, Zhenghan; Kim, Kisub; Tian, Haoye; Ezzini, Saad; Klein, Jacques
Format:	Preprint
Published:	2023
Subjects:	Cryptography and Security; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2312.01241

_version_	1866910713818120192
author	Tang, Xunzhu; Chen, Zhenghan; Kim, Kisub; Tian, Haoye; Ezzini, Saad; Klein, Jacques
author_facet	Tang, Xunzhu; Chen, Zhenghan; Kim, Kisub; Tian, Haoye; Ezzini, Saad; Klein, Jacques
contents	Open-source code is pervasive. In this setting, embedded vulnerabilities are spreading to downstream software at an alarming rate. While such vulnerabilities are generally identified and addressed rapidly, inconsistent maintenance policies may lead security patches to go unnoticed. Indeed, security patches can be silent, i.e., they do not always come with comprehensive advisories such as CVEs. This lack of transparency leaves users oblivious to available security updates, providing ample opportunity for attackers to exploit unpatched vulnerabilities. Consequently, identifying silent security patches just in time when they are released is essential for preventing n-day attacks, and for ensuring robust and secure maintenance practices. With LLMDA we propose to (1) leverage large language models (LLMs) to augment patch information with generated code change explanations, (2) design a representation learning approach that explores code-text alignment methodologies for feature combination, (3) implement a label-wise training with labelled instructions for guiding the embedding based on security relevance, and (4) rely on a probabilistic batch contrastive learning mechanism for building a high-precision identifier of security patches. We evaluate LLMDA on the PatchDB and SPI-DB literature datasets and show that our approach substantially improves over the state-of-the-art, notably GraphSPD by 20% in terms of F-Measure on the SPI-DB benchmark.
format	Preprint
id	arxiv_https___arxiv_org_abs_2312_01241
institution	arXiv
publishDate	2023
record_format	arxiv
title	Just-in-Time Detection of Silent Security Patches
topic	Cryptography and Security; Artificial Intelligence
url	https://arxiv.org/abs/2312.01241

Similar Items