Enregistré dans:
Détails bibliographiques
Auteurs principaux: Robinet, Lucas, Berjaoui, Ahmad, Moyal, Elizabeth Cohen-Jonathan
Format: Preprint
Publié: 2025
Sujets:
Accès en ligne:https://arxiv.org/abs/2508.00969
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
_version_ 1866917148018868224
author Robinet, Lucas
Berjaoui, Ahmad
Moyal, Elizabeth Cohen-Jonathan
author_facet Robinet, Lucas
Berjaoui, Ahmad
Moyal, Elizabeth Cohen-Jonathan
contents Self-supervised learning (SSL) has driven major advances in computational pathology by enabling the learning of rich representations from histopathology data. Yet, tissue analysis alone may fall short in capturing broader molecular complexity, as key complementary information resides in high-dimensional omics profiles such as transcriptomics, methylomics, and genomics. To address this gap, we introduce MORPHEUS, the first multimodal pre-training strategy that integrates histopathology images and multi-omics data within a shared transformer-based architecture. At its core, MORPHEUS relies on a novel masked omics modeling objective that encourages the model to learn meaningful cross-modal relationships. This yields a general-purpose pre-trained encoder that can be applied to histopathology alone or in combination with any subset of omics modalities. Beyond inference, MORPHEUS also supports flexible any-to-any omics reconstruction, enabling one or more omics profiles to be reconstructed from any modality subset that includes histopathology. Pre-trained on a large pan-cancer cohort, MORPHEUS shows substantial improvements over supervised and SSL baselines across diverse tasks and modality combinations. Together, these capabilities position it as a promising direction for the development of multimodal foundation models in oncology. Code is publicly available at https://github.com/Lucas-rbnt/MORPHEUS
format Preprint
id arxiv_https___arxiv_org_abs_2508_00969
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Masked Omics Modeling for Multimodal Representation Learning across Histopathology and Molecular Profiles
Robinet, Lucas
Berjaoui, Ahmad
Moyal, Elizabeth Cohen-Jonathan
Machine Learning
Artificial Intelligence
Self-supervised learning (SSL) has driven major advances in computational pathology by enabling the learning of rich representations from histopathology data. Yet, tissue analysis alone may fall short in capturing broader molecular complexity, as key complementary information resides in high-dimensional omics profiles such as transcriptomics, methylomics, and genomics. To address this gap, we introduce MORPHEUS, the first multimodal pre-training strategy that integrates histopathology images and multi-omics data within a shared transformer-based architecture. At its core, MORPHEUS relies on a novel masked omics modeling objective that encourages the model to learn meaningful cross-modal relationships. This yields a general-purpose pre-trained encoder that can be applied to histopathology alone or in combination with any subset of omics modalities. Beyond inference, MORPHEUS also supports flexible any-to-any omics reconstruction, enabling one or more omics profiles to be reconstructed from any modality subset that includes histopathology. Pre-trained on a large pan-cancer cohort, MORPHEUS shows substantial improvements over supervised and SSL baselines across diverse tasks and modality combinations. Together, these capabilities position it as a promising direction for the development of multimodal foundation models in oncology. Code is publicly available at https://github.com/Lucas-rbnt/MORPHEUS
title Masked Omics Modeling for Multimodal Representation Learning across Histopathology and Molecular Profiles
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2508.00969