Staff View: Privacy-Preserving Methods for Bug Severity Prediction :: Library Catalog

Bibliographic Details
Main Authors:	Dervişoğlu, Havvanur; Halepmollası, Ruşen; Eyvaz, Elif
Format:	Preprint
Published:	2025
Subjects:	Software Engineering
Online Access:	https://arxiv.org/abs/2506.22752

_version_	1866908427212554240
author	Dervişoğlu, Havvanur; Halepmollası, Ruşen; Eyvaz, Elif
author_facet	Dervişoğlu, Havvanur; Halepmollası, Ruşen; Eyvaz, Elif
contents	Bug severity prediction is a critical task in software engineering because it enables more efficient resource allocation and prioritization in software maintenance. While AI-based analyses and models require access to extensive datasets, industrial applications face challenges due to data-sharing constraints and the limited availability of labeled data. In this study, we investigate method-level bug severity prediction using source code metrics and Large Language Models (LLMs). We compare the performance of models trained using centralized learning, federated learning, and synthetic data generation. Our experimental results, obtained using two widely used software defect datasets, indicate that models trained with federated learning and synthetic data, which require no data sharing, achieve results comparable to those of centrally trained models. Our findings highlight the potential of privacy-preserving approaches such as federated learning and synthetic data generation to enable effective bug severity prediction in industrial contexts where data sharing is a major challenge. The source code and dataset are available at our GitHub repository: https://github.com/drvshavva/EASE2025-Privacy-Preserving-Methods-for-Bug-Severity-Prediction.
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_22752
institution	arXiv
publishDate	2025
record_format	arxiv
title	Privacy-Preserving Methods for Bug Severity Prediction
topic	Software Engineering
url	https://arxiv.org/abs/2506.22752

Similar Items