Bibliographic Details
Main Author: Sunny, Allen Daniel
Format: Preprint
Published: 2025
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2510.15005
_version_ 1866909852067954688
author Sunny, Allen Daniel
author_facet Sunny, Allen Daniel
contents Feature selection is a fundamental step in model development, shaping both predictive performance and interpretability. Yet most widely used methods focus on predictive accuracy, and their performance degrades in the presence of correlated predictors. To address this gap, we introduce TangledFeatures, a framework for feature selection in correlated feature spaces. It identifies representative features from groups of entangled predictors, reducing redundancy while retaining explanatory power. The resulting feature subset can be applied directly in downstream models, offering a more interpretable and stable basis for analysis than traditional selection techniques. We demonstrate the effectiveness of TangledFeatures on alanine dipeptide by applying it to the prediction of backbone torsional angles, and show that the selected features correspond to structurally meaningful interatomic distances that explain variation in these angles.
format Preprint
id arxiv_https___arxiv_org_abs_2510_15005
institution arXiv
publishDate 2025
record_format arxiv
title TangledFeatures: Robust Feature Selection in Highly Correlated Spaces
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2510.15005