Bibliographic Details
Main Authors: Lucassen, Ruben T., Moonemans, Sander P. J., van de Luijtgaarden, Tijn, Breimer, Gerben E., Blokx, Willeke A. M., Veta, Mitko
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2502.19293
author Lucassen, Ruben T.
Moonemans, Sander P. J.
van de Luijtgaarden, Tijn
Breimer, Gerben E.
Blokx, Willeke A. M.
Veta, Mitko
contents Millions of melanocytic skin lesions are examined by pathologists each year, the majority of which concern common nevi (i.e., ordinary moles). While most of these lesions can be diagnosed in seconds, writing the corresponding pathology report is much more time-consuming. Automating part of the report writing could, therefore, alleviate the increasing workload of pathologists. In this work, we develop a vision-language model specifically for the pathology domain of cutaneous melanocytic lesions. The model follows the Contrastive Captioner framework and was trained and evaluated using a melanocytic lesion dataset of 42,512 H&E-stained whole slide images and 19,645 corresponding pathology reports. Our results show that the quality scores of model-generated reports were on par with pathologist-written reports for common nevi, as assessed by an expert pathologist in a reader study. While report generation proved to be more difficult for rare melanocytic lesion subtypes, the cross-modal retrieval performance for these cases was considerably better.
format Preprint
id arxiv_https___arxiv_org_abs_2502_19293
institution arXiv
publishDate 2025
record_format arxiv
title Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2502.19293