| Main Author: | Salem, Mohamed |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2603.06609 |
| _version_ | 1866917319148568576 |
|---|---|
| author | Salem, Mohamed |
| author_facet | Salem, Mohamed |
| contents | Modern machine learning models are highly expressive but notoriously difficult to analyze statistically. In particular, while black-box predictors can achieve strong empirical performance, they rarely provide valid hypothesis tests or p-values for assessing whether individual features contain information about a target variable. This article presents a practical approach to feature-level hypothesis testing that combines the Conditional Randomization Test (CRT) with TabPFN, a probabilistic foundation model for tabular data. The resulting procedure yields finite-sample valid p-values for conditional feature relevance, even in nonlinear and correlated settings, without requiring model retraining or parametric assumptions. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_06609 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Valid Feature-Level Inference for Tabular Foundation Models via the Conditional Randomization Test |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2603.06609 |
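
The abstract describes the Conditional Randomization Test only in words, so a minimal sketch of the generic procedure may help readers of this record. It is not the preprint's implementation. It assumes the conditional distribution of the tested feature given the other feature is a known Gaussian, and it replaces the TabPFN-based importance score with a simple hypothetical stand-in statistic. The names `RHO`, `sample_x_given_z`, `statistic`, and `crt_p_value` are illustrative and do not come from the source.

```python
import math
import random

RHO = 0.6  # assumption: known correlation between tested feature x and covariate z


def sample_x_given_z(z_i, rng):
    """Draw x | z from the assumed-known conditional N(RHO * z, 1 - RHO**2)."""
    return rng.gauss(RHO * z_i, math.sqrt(1.0 - RHO**2))


def statistic(x, z, y):
    """Hypothetical stand-in score: |sum of (x - E[x | z]) * y|.

    The preprint pairs the CRT with TabPFN; any statistic computed from
    (x, z, y) keeps the test valid, so a cheap one is used here.
    """
    return abs(sum((xi - RHO * zi) * yi for xi, zi, yi in zip(x, z, y)))


def crt_p_value(x, z, y, n_draws=200, seed=0):
    """Conditional Randomization Test p-value for H0: x independent of y given z."""
    rng = random.Random(seed)
    t_obs = statistic(x, z, y)
    exceed = 0
    for _ in range(n_draws):
        # Resample the tested feature from its conditional law, leaving z and y fixed.
        x_tilde = [sample_x_given_z(zi, rng) for zi in z]
        if statistic(x_tilde, z, y) >= t_obs:
            exceed += 1
    # The +1 terms count the observed statistic among the draws.
    return (1 + exceed) / (1 + n_draws)


# Synthetic data: y_signal depends on x, y_null depends only on z.
data_rng = random.Random(42)
n = 200
z = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
x = [sample_x_given_z(zi, data_rng) for zi in z]
noise = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
y_signal = [2.0 * xi + zi + e for xi, zi, e in zip(x, z, noise)]
y_null = [zi + e for zi, e in zip(z, noise)]

p_signal = crt_p_value(x, z, y_signal)
p_null = crt_p_value(x, z, y_null)
print(p_signal, p_null)
```

The `(1 + exceed) / (1 + n_draws)` form is what gives finite-sample validity under the null when the conditional distribution is exact; with an estimated conditional, that guarantee holds only approximately.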