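| Main Authors: | Aziz, Abu Zahid Bin; Ahmed, Syed Fahim; Rasineni, Gnanesh; Wang, Mei; Hatipoglu, Olcaytu; Ricci, Marisa; Shaw, Malaiyah; Li, Guang; Brown, J. Quincy; Pascucci, Valerio; Elhabian, Shireen |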
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2604.10334 |
| _version_ | 1866918470208192512 |
|---|---|
| author | Aziz, Abu Zahid Bin; Ahmed, Syed Fahim; Rasineni, Gnanesh; Wang, Mei; Hatipoglu, Olcaytu; Ricci, Marisa; Shaw, Malaiyah; Li, Guang; Brown, J. Quincy; Pascucci, Valerio; Elhabian, Shireen |
| contents | Structured Illumination Microscopy (SIM) enables rapid, high-contrast optical sectioning of fresh tissue without staining or physical sectioning, making it promising for intraoperative and point-of-care diagnostics. Recent foundation and large-scale self-supervised models in digital pathology have demonstrated strong performance on section-based modalities such as Hematoxylin and Eosin (H&E) and immunohistochemistry (IHC). However, these approaches are predominantly trained on thin tissue sections and do not explicitly address thick-tissue fluorescence modalities such as SIM. When transferred directly to SIM, performance is constrained by substantial modality shift, and naive fine-tuning often overfits to modality-specific appearance rather than underlying histological structure. We introduce SIMPLER (Structured Illumination Microscopy-Powered Learning for Embedding Representations), a cross-modality self-supervised pretraining framework that leverages H&E as a semantic anchor to learn reusable SIM representations. H&E encodes rich cellular and glandular structure aligned with established clinical annotations, while SIM provides rapid, nondestructive imaging of fresh tissue. During pretraining, SIM and H&E are progressively aligned through adversarial, contrastive, and reconstruction-based objectives, encouraging SIM embeddings to internalize histological structure from H&E without collapsing modality-specific characteristics. A single pretrained SIMPLER encoder transfers across multiple downstream tasks, including multiple instance learning and morphological clustering, consistently outperforming SIM models trained from scratch or with H&E-only pretraining. These results suggest that histology-guided cross-modal pretraining yields biologically grounded SIM embeddings suitable for broad downstream reuse. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_10334 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | SIMPLER: H&E-Informed Representation Learning for Structured Illumination Microscopy |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2604.10334 |
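
The abstract names a contrastive objective among the losses that align SIM and H&E embeddings, but this record gives no formula for it. As a minimal sketch only, the block below shows one common form of such an objective, an InfoNCE loss over paired embeddings. The function names, the pairing by index, the temperature value, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def info_nce(sim_embeddings, he_embeddings, temperature=0.1):
    """Mean InfoNCE loss, assuming sim_embeddings[i] pairs with he_embeddings[i].

    Hypothetical illustration: the pairing scheme and temperature are
    assumptions, not the SIMPLER training recipe.
    """
    sim = [l2_normalize(v) for v in sim_embeddings]
    he = [l2_normalize(v) for v in he_embeddings]
    losses = []
    for i, s in enumerate(sim):
        # Similarity of SIM embedding i to every H&E embedding in the batch.
        logits = [sum(a * b for a, b in zip(s, h)) / temperature for h in he]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
        # Cross-entropy with the matching H&E embedding as the target.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D embeddings: correctly paired versus deliberately mismatched.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy setup the loss is small when each SIM embedding points in the same direction as its paired H&E embedding and large when the pairs are swapped, which is the behavior a contrastive alignment term relies on. The adversarial and reconstruction terms mentioned in the abstract are not sketched here because the record does not describe them.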