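Staff View: Global Interpretability via Automated Preprocessing: A Framework Inspired by Psychiatric Questionnaires :: Library Catalog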

Bibliographic Details
Main Author:	Strobl, Eric V.
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Quantitative Methods
Online Access:	https://arxiv.org/abs/2602.23459

_version_	1866918359082205184
author	Strobl, Eric V.
author_facet	Strobl, Eric V.
contents	Psychiatric questionnaires are highly context sensitive and often only weakly predict subsequent symptom severity, which makes the prognostic relationship difficult to learn. Although flexible nonlinear models can improve predictive accuracy, their limited interpretability can erode clinical trust. In fields such as imaging and omics, investigators commonly address visit- and instrument-specific artifacts by extracting stable signal through preprocessing and then fitting an interpretable linear model. We adopt the same strategy for questionnaire data by decoupling preprocessing from prediction: we restrict nonlinear capacity to a baseline preprocessing module that estimates stable item values, and then learn a linear mapping from these stabilized baseline items to future severity. We refer to this two-stage method as REFINE (Redundancy-Exploiting Follow-up-Informed Nonlinear Enhancement), which concentrates nonlinearity in preprocessing while keeping the prognostic relationship transparently linear and therefore globally interpretable through a coefficient matrix, rather than through post hoc local attributions. In experiments, REFINE outperforms other interpretable approaches while preserving clear global attribution of prognostic factors across psychiatric and non-psychiatric longitudinal prediction tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_23459
institution	arXiv
publishDate	2026
record_format	arxiv
title	Global Interpretability via Automated Preprocessing: A Framework Inspired by Psychiatric Questionnaires
topic	Machine Learning; Quantitative Methods
url	https://arxiv.org/abs/2602.23459