Bibliographic Details
Main Authors: Khan, Mohammad Kaviul Anam, Saarela, Olli, Kustra, Rafal
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2501.16988
_version_ 1866916587535073280
author Khan, Mohammad Kaviul Anam
Saarela, Olli
Kustra, Rafal
contents Interpreting black-box machine learning models is challenging because they depend strongly on the data and are inherently non-parametric. This paper reintroduces the concept of importance through the "Marginal Variable Importance Metric" (MVIM), a model-agnostic measure of predictor importance based on the true conditional expectation function. MVIM evaluates predictors' influence on continuous or discrete outcomes. A permutation-based approach, inspired by Breiman (2001) and Fisher et al. (2019), is proposed to estimate MVIM. The MVIM estimator is biased when predictors are highly correlated, because black-box models struggle to extrapolate in low-probability regions. To address this, we investigate the bias-variance decomposition of MVIM to understand the source and pattern of the bias under high correlation. A Conditional Variable Importance Metric (CVIM), adapted from Strobl et al. (2008), is introduced to reduce this bias. Both MVIM and CVIM exhibit a quadratic relationship with the conditional average treatment effect (CATE).
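The permutation idea mentioned in the abstract can be illustrated with a minimal sketch. This is a generic marginal permutation importance in the spirit of Breiman (2001), not the paper's MVIM estimator: the function name `permutation_importance`, the simulated data, the coefficients, and the least-squares stand-in for a black-box model are all assumptions made for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in squared-error loss when each column is permuted.

    Hypothetical helper for illustration; not the paper's MVIM estimator.
    """
    rng = np.random.default_rng(seed)
    base_loss = np.mean((y - predict(X)) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        losses = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle column j to break its link with y and the other columns.
            X_perm[:, j] = rng.permutation(X[:, j])
            losses.append(np.mean((y - predict(X_perm)) ** 2))
        importances[j] = np.mean(losses) - base_loss
    return importances

# Simulated data (assumed): independent predictors, third one is pure noise.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Stand-in for a fitted model: ordinary least squares with an intercept.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

def predict(X_new):
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

vim = permutation_importance(predict, X, y)
print(vim)
```

In this linear stand-in with independent predictors, the loss increase for predictor j is roughly twice the squared coefficient times the predictor's variance, so the score grows quadratically with the effect size. With correlated predictors, the shuffled rows can fall in regions the model never saw, which is the extrapolation problem the abstract describes.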
format Preprint
id arxiv_https___arxiv_org_abs_2501_16988
institution arXiv
publishDate 2025
record_format arxiv
title Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect
topic Machine Learning
url https://arxiv.org/abs/2501.16988