Bibliographic Details
Main Authors: Bosschieter, Tomas M., Franca, Luis, Wolk, Jessica, Wu, Yiyuan, Mehta, Bella, Dehoney, Joseph, Kiss, Orsolya, Baker, Fiona C., Zhao, Qingyu, Caruana, Rich, Pohl, Kilian M.
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2506.19937
author Bosschieter, Tomas M.
Franca, Luis
Wolk, Jessica
Wu, Yiyuan
Mehta, Bella
Dehoney, Joseph
Kiss, Orsolya
Baker, Fiona C.
Zhao, Qingyu
Caruana, Rich
Pohl, Kilian M.
contents While analyzing the importance of features has become ubiquitous in interpretable machine learning, the joint signal from a group of related features is sometimes overlooked or inadvertently excluded. Neglecting the joint signal could bypass a critical insight: in many instances, the most significant predictors are not isolated features, but rather the combined effect of groups of features. This can be especially problematic for datasets that contain natural groupings of features, including multimodal datasets. This paper introduces a novel approach to determine the importance of a group of features for Generalized Additive Models (GAMs) that is efficient, requires no model retraining, allows defining groups post hoc, permits overlapping groups, and remains meaningful in high-dimensional settings. Moreover, this definition offers a parallel with explained variation in statistics. We showcase properties of our method on three synthetic experiments that illustrate the behavior of group importance across various data regimes. We then demonstrate the importance of groups of features in identifying depressive symptoms from a multimodal neuroscience dataset, and study the importance of social determinants of health after total hip arthroplasty. These two case studies reveal that analyzing group importance offers a more accurate, holistic view of the medical issues compared to a single-feature analysis.
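The abstract does not state the formula behind its group importance, so the following is only a minimal sketch under an assumed definition: the mean absolute value of the summed per-feature contributions of a group. The function name `group_importance`, the toy shape functions, and the simulated data are all hypothetical and are not taken from the paper. The sketch is meant to illustrate why, in an additive model, such a score needs no retraining and allows groups to be chosen after fitting, including overlapping ones.

```python
import numpy as np


def group_importance(contributions, group):
    """Mean absolute value of the summed contributions of the features in `group`.

    contributions: (n_samples, n_features) array; column j holds f_j(x_j),
                   the additive term of feature j evaluated on each sample.
    group: iterable of column indices. Groups are chosen after fitting
           and may overlap with one another.
    """
    idx = list(group)
    return float(np.mean(np.abs(contributions[:, idx].sum(axis=1))))


rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))

# Hypothetical shape functions standing in for the terms of a fitted GAM.
contributions = np.column_stack([
    0.5 * X[:, 0],
    0.5 * X[:, 1],
    np.sin(X[:, 2]),
])
# Center each term so that it has zero mean over the sample.
contributions -= contributions.mean(axis=0)

# Single-feature importances are the special case of one-element groups.
single = [group_importance(contributions, [j]) for j in range(3)]

# Two overlapping groups, scored from the same contribution matrix
# without refitting anything.
pair_01 = group_importance(contributions, [0, 1])
pair_12 = group_importance(contributions, [1, 2])

print("single:", single)
print("group {0, 1}:", pair_01)
print("group {1, 2}:", pair_12)
```

Under this assumed definition, a group's score can never exceed the sum of its members' individual scores (by the triangle inequality), and it falls below that sum when the members' contributions partially cancel each other.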
format Preprint
id arxiv_https___arxiv_org_abs_2506_19937
institution arXiv
publishDate 2025
record_format arxiv
title The Most Important Features in Generalized Additive Models Might Be Groups of Features
topic Machine Learning
url https://arxiv.org/abs/2506.19937