_version_ 1866910792229584896
author Yew, Samantha Min Er
Lei, Xiaofeng
Goh, Jocelyn Hui Lin
Chen, Yibing
Srinivasan, Sahana
Chee, Miao-li
Pushpanathan, Krithi
Zou, Ke
Hou, Qingshan
Da Soh, Zhi
Xue, Cancan
Yu, Marco Chak Yan
Sabanayagam, Charumathi
Tai, E Shyong
Sim, Xueling
Wang, Yaxing
Jonas, Jost B.
Nangia, Vinay
Yang, Gabriel Dawei
Ran, Emma Anran
Cheung, Carol Yim-Lui
Feng, Yangqin
Zhou, Jun
Goh, Rick Siow Mong
Zhou, Yukun
Keane, Pearse A.
Liu, Yong
Cheng, Ching-Yu
Tham, Yih-Chung
author_facet Yew, Samantha Min Er
Lei, Xiaofeng
Goh, Jocelyn Hui Lin
Chen, Yibing
Srinivasan, Sahana
Chee, Miao-li
Pushpanathan, Krithi
Zou, Ke
Hou, Qingshan
Da Soh, Zhi
Xue, Cancan
Yu, Marco Chak Yan
Sabanayagam, Charumathi
Tai, E Shyong
Sim, Xueling
Wang, Yaxing
Jonas, Jost B.
Nangia, Vinay
Yang, Gabriel Dawei
Ran, Emma Anran
Cheung, Carol Yim-Lui
Feng, Yangqin
Zhou, Jun
Goh, Rick Siow Mong
Zhou, Yukun
Keane, Pearse A.
Liu, Yong
Cheng, Ching-Yu
Tham, Yih-Chung
contents Background: RETFound, a self-supervised, retina-specific foundation model (FM), has shown potential in downstream applications. However, its performance relative to traditional deep learning (DL) models remains incompletely understood. This study aimed to evaluate RETFound against three ImageNet-pretrained supervised DL models (ResNet50, ViT-base, SwinV2) in detecting ocular and systemic diseases. Methods: We fine-tuned/trained RETFound and the three DL models on the full datasets, on 50% and 20% subsets, and on fixed sample sizes (400, 200, and 100 images, with half comprising disease cases; for each diabetic retinopathy (DR) severity class, 100 and 50 cases were used). Fine-tuned models were tested internally using the SEED (53,090 images) and APTOS-2019 (3,672 images) datasets and externally validated on population-based (BES, CIEMS, SP2, UKBB) and open-source datasets (ODIR-5k, PAPILA, GAMMA, IDRiD, MESSIDOR-2). Model performance was compared using the area under the receiver operating characteristic curve (AUC) and Z-tests with Bonferroni correction (P<0.05/3). Interpretation: Traditional DL models are mostly comparable to RETFound for ocular disease detection with large datasets. However, RETFound is superior in systemic disease detection with smaller datasets. These findings offer valuable insights into the respective merits and limitations of traditional models and FMs.
format Preprint
id arxiv_https___arxiv_org_abs_2501_12016
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Are Traditional Deep Learning Model Approaches as Effective as a Retinal-Specific Foundation Model for Ocular and Systemic Disease Detection?
Yew, Samantha Min Er
Lei, Xiaofeng
Goh, Jocelyn Hui Lin
Chen, Yibing
Srinivasan, Sahana
Chee, Miao-li
Pushpanathan, Krithi
Zou, Ke
Hou, Qingshan
Da Soh, Zhi
Xue, Cancan
Yu, Marco Chak Yan
Sabanayagam, Charumathi
Tai, E Shyong
Sim, Xueling
Wang, Yaxing
Jonas, Jost B.
Nangia, Vinay
Yang, Gabriel Dawei
Ran, Emma Anran
Cheung, Carol Yim-Lui
Feng, Yangqin
Zhou, Jun
Goh, Rick Siow Mong
Zhou, Yukun
Keane, Pearse A.
Liu, Yong
Cheng, Ching-Yu
Tham, Yih-Chung
Computer Vision and Pattern Recognition
Machine Learning
Background: RETFound, a self-supervised, retina-specific foundation model (FM), has shown potential in downstream applications. However, its performance relative to traditional deep learning (DL) models remains incompletely understood. This study aimed to evaluate RETFound against three ImageNet-pretrained supervised DL models (ResNet50, ViT-base, SwinV2) in detecting ocular and systemic diseases. Methods: We fine-tuned/trained RETFound and the three DL models on the full datasets, on 50% and 20% subsets, and on fixed sample sizes (400, 200, and 100 images, with half comprising disease cases; for each diabetic retinopathy (DR) severity class, 100 and 50 cases were used). Fine-tuned models were tested internally using the SEED (53,090 images) and APTOS-2019 (3,672 images) datasets and externally validated on population-based (BES, CIEMS, SP2, UKBB) and open-source datasets (ODIR-5k, PAPILA, GAMMA, IDRiD, MESSIDOR-2). Model performance was compared using the area under the receiver operating characteristic curve (AUC) and Z-tests with Bonferroni correction (P<0.05/3). Interpretation: Traditional DL models are mostly comparable to RETFound for ocular disease detection with large datasets. However, RETFound is superior in systemic disease detection with smaller datasets. These findings offer valuable insights into the respective merits and limitations of traditional models and FMs.
title Are Traditional Deep Learning Model Approaches as Effective as a Retinal-Specific Foundation Model for Ocular and Systemic Disease Detection?
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2501.12016