Bibliographic Details
Main Authors: Yan, Fang; Wu, Jianfeng; Li, Jiawen; Wang, Wei; Lu, Jiaxuan; Chen, Wen; Gao, Zizhao; Li, Jianan; Yan, Hong; Ma, Jiabo; Chen, Minda; Lu, Yang; Chen, Qing; Wang, Yizhi; Ling, Xitong; Wang, Xuenian; Wang, Zihan; Huang, Qiang; Hua, Shengyi; Liu, Mianxin; Ma, Lei; Shen, Tian; Zhang, Xiaofan; He, Yonghong; Chen, Hao; Zhang, Shaoting; Wang, Zhe
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2503.24345
author Yan, Fang
Wu, Jianfeng
Li, Jiawen
Wang, Wei
Lu, Jiaxuan
Chen, Wen
Gao, Zizhao
Li, Jianan
Yan, Hong
Ma, Jiabo
Chen, Minda
Lu, Yang
Chen, Qing
Wang, Yizhi
Ling, Xitong
Wang, Xuenian
Wang, Zihan
Huang, Qiang
Hua, Shengyi
Liu, Mianxin
Ma, Lei
Shen, Tian
Zhang, Xiaofan
He, Yonghong
Chen, Hao
Zhang, Shaoting
Wang, Zhe
contents The complexity and variability inherent in high-resolution pathological images present significant challenges in computational pathology. While pathology foundation models leveraging AI have catalyzed transformative advancements, their development demands large-scale datasets, considerable storage capacity, and substantial computational resources. Furthermore, ensuring their clinical applicability and generalizability requires rigorous validation across a broad spectrum of clinical tasks. Here, we present PathOrchestra, a versatile pathology foundation model trained via self-supervised learning on a dataset comprising 300K pathological slides from 20 tissue and organ types across multiple centers. The model was rigorously evaluated on 112 clinical tasks using a combination of 61 private and 51 public datasets. These tasks encompass digital slide preprocessing, pan-cancer classification, lesion identification, multi-cancer subtype classification, biomarker assessment, gene expression prediction, and the generation of structured reports. PathOrchestra demonstrated exceptional performance across 27,755 WSIs and 9,415,729 ROIs, achieving over 0.950 accuracy in 47 tasks, including pan-cancer classification across various organs, lymphoma subtype diagnosis, and bladder cancer screening. Notably, it is the first model to generate structured reports for high-incidence colorectal cancer and diagnostically complex lymphoma, areas that are infrequently addressed by foundation models but hold immense clinical potential. Overall, PathOrchestra exemplifies the feasibility and efficacy of a large-scale, self-supervised pathology foundation model, validated across a broad range of clinical-grade tasks. Its high accuracy and reduced reliance on extensive data annotation underline its potential for clinical integration, offering a pathway toward more efficient and high-quality medical services.
format Preprint
id arxiv_https___arxiv_org_abs_2503_24345
institution arXiv
publishDate 2025
record_format arxiv
title PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2503.24345