Bibliographic Details
Main Authors: Winter, Dominik; Bui, Mai; Gavaldon, Monica Azqueta; Triltsch, Nicolas; Rosati, Marco; Brieu, Nicolas
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access: https://arxiv.org/abs/2510.09121
author Winter, Dominik
Bui, Mai
Gavaldon, Monica Azqueta
Triltsch, Nicolas
Rosati, Marco
Brieu, Nicolas
contents Scarcity of annotated data, particularly for rare or atypical morphologies, presents a significant challenge for cell and nuclei segmentation in computational pathology. While manual annotation is labor-intensive and costly, synthetic data offers a cost-effective alternative. We introduce a Multimodal Semantic Diffusion Model (MSDM) for generating realistic, pixel-precise image-mask pairs for cell and nuclei segmentation. By conditioning the generative process on cellular/nuclear morphologies (using horizontal and vertical maps), RGB color characteristics, and BERT-encoded assay/indication metadata, MSDM generates datasets with desired morphological properties. These heterogeneous modalities are integrated via multi-head cross-attention, enabling fine-grained control over the generated images. Quantitative analysis demonstrates that synthetic images closely match real data, with low Wasserstein distances between embeddings of generated and real images under matching biological conditions. The incorporation of these synthetic samples, exemplified by columnar cells, significantly improves segmentation model accuracy on columnar cells. This strategy systematically enriches datasets, directly targeting model deficiencies. We highlight the effectiveness of multimodal diffusion-based augmentation for advancing the robustness and generalizability of cell and nuclei segmentation models. Thereby, we pave the way for broader application of generative models in computational pathology.
format Preprint
id arxiv_https___arxiv_org_abs_2510_09121
institution arXiv
publishDate 2025
record_format arxiv
title MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2510.09121