Saved in:
| Main Author: | |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.01941 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866918184562458624 |
|---|---|
| author | Masoumzadeh, Sogol |
| author_facet | Masoumzadeh, Sogol |
| contents | Timely identification of issue reports reflecting software vulnerabilities is crucial, particularly for Internet-of-Things (IoT) where analysis is slower than non-IoT systems. While Machine Learning (ML) and Large Language Models (LLMs) detect vulnerability-indicating issues in non-IoT systems, their IoT use remains unexplored. We are the first to tackle this problem by proposing two approaches: (1) combining ML and LLMs with Natural Language Processing (NLP) techniques to detect vulnerability-indicating issues of 21 Eclipse IoT projects and (2) fine-tuning a pre-trained BERT Masked Language Model (MLM) on 11,000 GitHub issues for classifying \vul. Our best performance belongs to a Support Vector Machine (SVM) trained on BERT NLP features, achieving an Area Under the receiver operator characteristic Curve (AUC) of 0.65. The fine-tuned BERT achieves 0.26 accuracy, emphasizing the importance of exposing all data during training. Our contributions set the stage for accurately detecting IoT vulnerabilities from issue reports, similar to non-IoT systems. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2511_01941 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Detecting Vulnerabilities from Issue Reports for Internet-of-Things Masoumzadeh, Sogol Software Engineering Artificial Intelligence Cryptography and Security Timely identification of issue reports reflecting software vulnerabilities is crucial, particularly for Internet-of-Things (IoT) where analysis is slower than non-IoT systems. While Machine Learning (ML) and Large Language Models (LLMs) detect vulnerability-indicating issues in non-IoT systems, their IoT use remains unexplored. We are the first to tackle this problem by proposing two approaches: (1) combining ML and LLMs with Natural Language Processing (NLP) techniques to detect vulnerability-indicating issues of 21 Eclipse IoT projects and (2) fine-tuning a pre-trained BERT Masked Language Model (MLM) on 11,000 GitHub issues for classifying \vul. Our best performance belongs to a Support Vector Machine (SVM) trained on BERT NLP features, achieving an Area Under the receiver operator characteristic Curve (AUC) of 0.65. The fine-tuned BERT achieves 0.26 accuracy, emphasizing the importance of exposing all data during training. Our contributions set the stage for accurately detecting IoT vulnerabilities from issue reports, similar to non-IoT systems. |
| title | Detecting Vulnerabilities from Issue Reports for Internet-of-Things |
| topic | Software Engineering Artificial Intelligence Cryptography and Security |
| url | https://arxiv.org/abs/2511.01941 |