Saved in:
| Main Authors: | , , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.16988 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866916587535073280 |
|---|---|
| author | Khan, Mohammad Kaviul Anam Saarela, Olli Kustra, Rafal |
| author_facet | Khan, Mohammad Kaviul Anam Saarela, Olli Kustra, Rafal |
| contents | Interpreting black-box machine learning models is challenging due to their strong dependence on data and inherently non-parametric nature. This paper reintroduces the concept of importance through "Marginal Variable Importance Metric" (MVIM), a model-agnostic measure of predictor importance based on the true conditional expectation function. MVIM evaluates predictors' influence on continuous or discrete outcomes. A permutation-based estimation approach, inspired by \citet{breiman2001random} and \citet{fisher2019all}, is proposed to estimate MVIM. MVIM estimator is biased when predictors are highly correlated, as black-box models struggle to extrapolate in low-probability regions. To address this, we investigated the bias-variance decomposition of MVIM to understand the source and pattern of the bias under high correlation. A Conditional Variable Importance Metric (CVIM), adapted from \citet{strobl2008conditional}, is introduced to reduce this bias. Both MVIM and CVIM exhibit a quadratic relationship with the conditional average treatment effect (CATE). |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2501_16988 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect Khan, Mohammad Kaviul Anam Saarela, Olli Kustra, Rafal Machine Learning Interpreting black-box machine learning models is challenging due to their strong dependence on data and inherently non-parametric nature. This paper reintroduces the concept of importance through "Marginal Variable Importance Metric" (MVIM), a model-agnostic measure of predictor importance based on the true conditional expectation function. MVIM evaluates predictors' influence on continuous or discrete outcomes. A permutation-based estimation approach, inspired by \citet{breiman2001random} and \citet{fisher2019all}, is proposed to estimate MVIM. MVIM estimator is biased when predictors are highly correlated, as black-box models struggle to extrapolate in low-probability regions. To address this, we investigated the bias-variance decomposition of MVIM to understand the source and pattern of the bias under high correlation. A Conditional Variable Importance Metric (CVIM), adapted from \citet{strobl2008conditional}, is introduced to reduce this bias. Both MVIM and CVIM exhibit a quadratic relationship with the conditional average treatment effect (CATE). |
| title | Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2501.16988 |