Bibliographic Details
Main Authors: Zhang, Yang; Fan, Li; Lawrence, Samuel; Li, Shi
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2603.20921
_version_ 1866917356448514048
author Zhang, Yang
Fan, Li
Lawrence, Samuel
Li, Shi
contents Foundation models in healthcare have largely adopted self-supervised pretraining objectives inherited from natural language processing and computer vision, emphasizing reconstruction and large-scale representation learning prior to downstream adaptation. We revisit this paradigm in outcome-centric clinical prediction settings and argue that, when high-quality supervision is available, direct outcome alignment may provide a stronger inductive bias than generative pretraining. We propose a supervised deep learning framework that explicitly shapes representation geometry by maximizing inter-class separation relative to within-class variance, thereby concentrating model capacity along clinically meaningful axes. Across multiple longitudinal electronic health record tasks, including mortality and readmission prediction, our approach consistently outperforms masked, autoregressive, and contrastive pretraining baselines under matched model capacity. The proposed method improves discrimination, calibration, and sample efficiency while simplifying the training pipeline to a single-stage optimization. These findings suggest that in low-entropy, outcome-driven healthcare domains, supervision can act as the statistically optimal driver of representation learning, challenging the assumption that large-scale self-supervised pretraining is a prerequisite for strong clinical performance.
format Preprint
id arxiv_https___arxiv_org_abs_2603_20921
institution arXiv
publishDate 2026
record_format arxiv
title Discriminative Representation Learning for Clinical Prediction
topic Machine Learning
url https://arxiv.org/abs/2603.20921
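note The abstract describes an objective that maximizes inter-class separation relative to within-class variance but does not give the loss itself. The sketch below is an illustrative assumption, not the authors' method: it computes a classical Fisher-style ratio (trace of between-class scatter over trace of within-class scatter) on a small set of embeddings. The function name `fisher_ratio`, the toy data, and the stabilizing constant `eps` are all hypothetical.

```python
from collections import defaultdict


def fisher_ratio(embeddings, labels, eps=1e-12):
    """Trace of between-class scatter divided by trace of within-class scatter.

    Illustrative only: the record's abstract does not specify the exact
    objective, so this classical Fisher-style criterion is an assumption.
    """
    dim = len(embeddings[0])

    def mean(vectors):
        # Component-wise mean of a list of equal-length vectors.
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Group embeddings by class label.
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)

    global_mean = mean(embeddings)
    between = 0.0
    within = 0.0
    for vectors in groups.values():
        class_mean = mean(vectors)
        # Between-class term: class size times squared distance to global mean.
        between += len(vectors) * sq_dist(class_mean, global_mean)
        # Within-class term: squared distances of members to their class mean.
        within += sum(sq_dist(v, class_mean) for v in vectors)

    # eps guards against division by zero when every class is a single point.
    return between / (within + eps)


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
separated = fisher_ratio(points, [0, 0, 1, 1])  # classes split along the wide axis
mixed = fisher_ratio(points, [0, 1, 0, 1])      # classes split along the narrow axis
print(separated, mixed)
```

With these toy points, the first labeling yields a ratio near 100 and the second a ratio near 0.01, showing how the criterion rewards representations whose class means lie far apart relative to the spread inside each class. A training objective in this spirit would presumably maximize such a ratio over learned embeddings, though the record does not confirm the exact form.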