Bibliographic Details
Main Authors: Aziz, Abu Zahid Bin, Ahmed, Syed Fahim, Rasineni, Gnanesh, Wang, Mei, Hatipoglu, Olcaytu, Ricci, Marisa, Shaw, Malaiyah, Li, Guang, Brown, J. Quincy, Pascucci, Valerio, Elhabian, Shireen
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2604.10334
contents Structured Illumination Microscopy (SIM) enables rapid, high-contrast optical sectioning of fresh tissue without staining or physical sectioning, making it promising for intraoperative and point-of-care diagnostics. Recent foundation and large-scale self-supervised models in digital pathology have demonstrated strong performance on section-based modalities such as Hematoxylin and Eosin (H&E) and immunohistochemistry (IHC). However, these approaches are predominantly trained on thin tissue sections and do not explicitly address thick-tissue fluorescence modalities such as SIM. When transferred directly to SIM, performance is constrained by substantial modality shift, and naive fine-tuning often overfits to modality-specific appearance rather than underlying histological structure. We introduce SIMPLER (Structured Illumination Microscopy-Powered Learning for Embedding Representations), a cross-modality self-supervised pretraining framework that leverages H&E as a semantic anchor to learn reusable SIM representations. H&E encodes rich cellular and glandular structure aligned with established clinical annotations, while SIM provides rapid, nondestructive imaging of fresh tissue. During pretraining, SIM and H&E are progressively aligned through adversarial, contrastive, and reconstruction-based objectives, encouraging SIM embeddings to internalize histological structure from H&E without collapsing modality-specific characteristics. A single pretrained SIMPLER encoder transfers across multiple downstream tasks, including multiple instance learning and morphological clustering, consistently outperforming SIM models trained from scratch or with H&E-only pretraining. These results suggest that histology-guided cross-modal pretraining yields biologically grounded SIM embeddings suitable for broad downstream reuse.
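The abstract names a contrastive objective for aligning SIM and H&E embeddings but does not give its form. As a rough illustration only, the sketch below implements a generic symmetric InfoNCE loss in plain Python. It is not the authors' implementation: the function names, the temperature value, and the assumption that row i of each batch forms a matched SIM/H&E pair are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(sim_emb, he_emb, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of embedding pairs.

    Assumption (not from the source): sim_emb[i] and he_emb[i] are embeddings
    of a matched SIM/H&E pair, and all other rows in the batch are negatives.
    """
    a = [l2_normalize(v) for v in sim_emb]
    b = [l2_normalize(v) for v in he_emb]
    n = len(a)

    # Cosine similarity between every SIM row and every H&E row, scaled by temperature.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Mean cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for a numerically stable log-sum-exp
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Average the SIM-to-H&E and H&E-to-SIM directions.
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))
```

Under these assumptions, a batch whose matched rows point in the same direction yields a loss near zero, while a batch with mismatched rows yields a large loss. In a full pretraining setup this term would be combined with the adversarial and reconstruction objectives, whose weighting the abstract does not specify.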
id arxiv_https___arxiv_org_abs_2604_10334
institution arXiv
title SIMPLER: H&E-Informed Representation Learning for Structured Illumination Microscopy
topic Computer Vision and Pattern Recognition