Bibliographic Details
Main Authors: Sharma, Pradeep Kumar, Godbole, Shantanu, Jena, Sarada Prasad, Shrivastava, Hritvik
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2601.06185
author Sharma, Pradeep Kumar
Godbole, Shantanu
Jena, Sarada Prasad
Shrivastava, Hritvik
contents Identifying and ranking impacted files within software repositories is a key challenge in change impact analysis. Existing deterministic approaches that combine heuristic signals, semantic similarity measures, and graph-based centrality metrics are effective at narrowing candidate search spaces, yet their recall plateaus. This limitation stems from treating features as linearly independent contributors, which ignores the contextual dependencies and relationships between metrics that characterize expert reasoning. To address this limitation, we propose applying Multi-Head Self-Attention as a post-deterministic scoring refinement mechanism. Our approach learns contextual weighting between features, dynamically adjusting importance levels per file based on relational behavior exhibited across the candidate file set. The attention mechanism produces context-aware adjustments that are additively combined with deterministic scores, preserving interpretability while enabling reasoning similar to that performed by experts when reviewing change surfaces. We focus on recall rather than precision, because false negatives (missing impacted files) are far more costly than false positives (irrelevant files that can be quickly dismissed during review). Empirical evaluation on 200 test cases demonstrates that introducing self-attention improves Top-50 recall from approximately 62-65% to 78-82%, depending on repository complexity and structure, achieving 80% recall at Top-50 files. Expert validation yields an improvement from 6.5/10 to 8.6/10 in subjective accuracy alignment. This refinement narrows the reasoning gap between deterministic automation and expert judgment, improving recall in repository-aware effort estimation.
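The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-column feature layout (heuristic, semantic similarity, centrality), the function names, the head count and dimensions, and the file names are all assumptions made for illustration, and the projection matrices are random stand-ins for parameters that would be learned in practice. The sketch computes one attention-derived adjustment per candidate file, adds it to the deterministic score, and measures Top-K recall on the resulting ranking.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """ASSUMPTION: random weights stand in for learned parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def attention_adjustments(features, num_heads, head_dim, rng):
    """Return one scalar adjustment per file via multi-head self-attention.

    Each file attends to every file in the candidate set, so its adjustment
    depends on the context of the other candidates, not only its own features.
    """
    d = len(features[0])
    head_outputs = []
    for _ in range(num_heads):
        wq, wk, wv = (random_matrix(d, head_dim, rng) for _ in range(3))
        q, k, v = matmul(features, wq), matmul(features, wk), matmul(features, wv)
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)  # file-to-file compatibility
        weights = [softmax([s / math.sqrt(head_dim) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v))
    # Concatenate the heads for each file, then project to a single scalar.
    concat = [sum((h[i] for h in head_outputs), []) for i in range(len(features))]
    w_out = random_matrix(num_heads * head_dim, 1, rng)
    return [row[0] for row in matmul(concat, w_out)]


def refine_scores(det_scores, adjustments):
    """Additive combination: the deterministic score stays visible in the sum."""
    return [s + a for s, a in zip(det_scores, adjustments)]


def recall_at_k(ranked_files, impacted, k):
    """Fraction of truly impacted files that appear in the top-k ranking."""
    return len(set(ranked_files[:k]) & impacted) / len(impacted)


# Hypothetical candidate set: [heuristic, semantic similarity, centrality].
files = ["auth.py", "db.py", "routes.py", "README.md"]
features = [[0.9, 0.1, 0.4], [0.2, 0.8, 0.5], [0.4, 0.4, 0.9], [0.1, 0.2, 0.3]]
det_scores = [sum(f) / len(f) for f in features]  # stand-in deterministic score

adjustments = attention_adjustments(features, num_heads=2, head_dim=4,
                                    rng=random.Random(0))
refined = refine_scores(det_scores, adjustments)
ranking = [name for _, name in sorted(zip(refined, files), reverse=True)]
print(ranking, recall_at_k(ranking, {"auth.py", "routes.py"}, k=2))
```

Because the adjustment is added to the deterministic score rather than replacing it, each file's final score can still be decomposed into its original deterministic part and the context-dependent correction, which is the interpretability property the abstract emphasizes.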
format Preprint
id arxiv_https___arxiv_org_abs_2601_06185
institution arXiv
publishDate 2026
record_format arxiv
title Attention Mechanism and Heuristic Approach: Context-Aware File Ranking Using Multi-Head Self-Attention
topic Software Engineering
Artificial Intelligence
Computation and Language
url https://arxiv.org/abs/2601.06185