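Staff View: Training Frozen Feature Pyramid DINOv2 for Eyelid Measurements with Infinite Encoding and Orthogonal Regularization :: Library Catalog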

Bibliographic Details
Main Author:	Chen, Chun-Hung
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence; Computer Vision and Pattern Recognition; Neural and Evolutionary Computing
Online Access:	https://arxiv.org/abs/2504.00515

_version_	1866915221458649088
author	Chen, Chun-Hung
author_facet	Chen, Chun-Hung
contents	Accurate measurement of eyelid parameters such as Margin Reflex Distances (MRD1, MRD2) and Levator Function (LF) is critical in oculoplastic diagnostics but remains limited by manual, inconsistent methods. This study evaluates three deep learning models (SE-ResNet, EfficientNet, and the vision transformer-based DINOv2) for automating these measurements from smartphone-acquired images. We assess performance in frozen and fine-tuned settings using MSE, MAE, and R² metrics. DINOv2, pretrained through self-supervised learning, demonstrates superior scalability and robustness, especially under frozen conditions, which are ideal for mobile deployment. Lightweight regressors such as MLP and Deep Ensemble offer high precision with minimal computational overhead. To address class imbalance and improve generalization, we integrate focal loss, orthogonal regularization, and binary encoding strategies. Our results show that DINOv2 combined with these enhancements delivers consistent, accurate predictions across all tasks, making it a strong candidate for real-world, mobile-friendly clinical applications. This work highlights the potential of foundation models in advancing AI-powered ophthalmic care.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_00515
institution	arXiv
publishDate	2025
record_format	arxiv
title	Training Frozen Feature Pyramid DINOv2 for Eyelid Measurements with Infinite Encoding and Orthogonal Regularization
topic	Machine Learning; Artificial Intelligence; Computer Vision and Pattern Recognition; Neural and Evolutionary Computing
url	https://arxiv.org/abs/2504.00515
