Staff View: Benchmarking Computational Pathology Foundation Models For Semantic Segmentation :: Library Catalog

Bibliographic Details
Main Authors:	Ramchandani, Lavish; Tinaikar, Aashay; Das, Dev Kumar; Garg, Rohit; Thomas, Tijo
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.18747

_version_	1866915810975416320
author	Ramchandani, Lavish; Tinaikar, Aashay; Das, Dev Kumar; Garg, Rohit; Thomas, Tijo
author_facet	Ramchandani, Lavish; Tinaikar, Aashay; Das, Dev Kumar; Garg, Rohit; Thomas, Tijo
contents	In recent years, foundation models such as CLIP, DINO, and CONCH have demonstrated remarkable domain generalization and unsupervised feature extraction capabilities across diverse imaging tasks. However, systematic and independent evaluations of these models for pixel-level semantic segmentation in histopathology remain scarce. In this study, we propose a robust benchmarking approach to assess 10 foundation models on four histopathological datasets covering both morphological tissue-region and cellular/nuclear segmentation tasks. Our method uses the attention maps of foundation models as pixel-wise features, which are then classified with a machine learning algorithm, XGBoost, enabling fast, interpretable, and model-agnostic evaluation without fine-tuning. We show that the vision-language foundation model CONCH performed best across datasets when compared to vision-only foundation models, with PathDino a close second. Further analysis shows that models trained on distinct histopathology cohorts capture complementary morphological representations, and that concatenating their features yields superior segmentation performance. Concatenating features from CONCH, PathDino, and CellViT outperformed individual models across all the datasets by 7.95% (averaged across the datasets), suggesting that ensembles of foundation models can better generalize to diverse histopathological segmentation tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_18747
institution	arXiv
publishDate	2026
record_format	arxiv
title	Benchmarking Computational Pathology Foundation Models For Semantic Segmentation
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2602.18747
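
Illustrative note: the abstract in the contents field describes using attention maps as pixel-wise features and concatenating features from several models. The sketch below shows one way such a feature matrix could be assembled. It is not the authors' code. The array shapes, the nearest-neighbour upsampling, the function names, and the random stand-in attention maps are all assumptions made for illustration.

```python
import numpy as np


def upsample_nearest(attn, out_hw):
    """Nearest-neighbour upsample of attention maps.

    attn:   array of shape (heads, h, w), one map per attention head
    out_hw: target (H, W) image resolution
    returns an array of shape (heads, H, W)
    """
    heads, h, w = attn.shape
    H, W = out_hw
    rows = np.arange(H) * h // H  # source row for each output row
    cols = np.arange(W) * w // W  # source column for each output column
    return attn[:, rows[:, None], cols[None, :]]


def pixel_features(attn_maps, out_hw):
    """Concatenate several models' attention maps into per-pixel features.

    attn_maps: list of arrays, each (heads_i, h_i, w_i), one per model
    returns an array of shape (H * W, sum of heads_i)
    """
    per_model = [upsample_nearest(a, out_hw) for a in attn_maps]
    stacked = np.concatenate(per_model, axis=0)  # (total_heads, H, W)
    return stacked.reshape(stacked.shape[0], -1).T


# Hypothetical stand-ins for two models' attention maps (random values).
rng = np.random.default_rng(0)
model_a = rng.random((3, 4, 4))  # 3 heads on a 4x4 patch grid (assumed)
model_b = rng.random((2, 2, 2))  # 2 heads on a 2x2 patch grid (assumed)

X = pixel_features([model_a, model_b], out_hw=(8, 8))
print(X.shape)  # one row per pixel, one column per attention head
```

Each row of X is one pixel and each column is one attention head, so adding a model only adds columns. A matrix of this form, paired with a flattened label mask, could then be passed to a gradient-boosted tree classifier such as XGBoost; the training setup used in the preprint is not specified in this record.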

Similar Items