Bibliographic Details
Main Authors: Gao, Ziyu; Wu, Xinyuan; Chen, Xiaolan; Liu, Zhuoran; Chen, Ruoyu; Liu, Bowen; Yan, Bingjie; Wang, Zhenhan; Jin, Kai; Yang, Jiancheng; Tham, Yih Chung; He, Mingguang; Shi, Danli
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2603.14039
contents Ophthalmic decision-making depends on subtle lesion-scale cues interpreted across multimodal imaging and over time, yet most medical foundation models remain static and degrade under modality and acquisition shifts. Here we introduce EyeWorld, a generative world model that conceptualizes the eye as a partially observed dynamical system grounded in clinical imaging. EyeWorld learns an observation-stable latent ocular state shared across modalities, unifying fine-grained parsing, structure-preserving cross-modality translation and quality-robust enhancement within a single framework. Longitudinal supervision further enables time-conditioned state transitions, supporting forecasting of clinically meaningful progression while preserving stable anatomy. By moving from static representation learning to explicit dynamical modeling, EyeWorld provides a unified approach to robust multimodal interpretation and prognosis-oriented simulation in medicine.
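The abstract's framing of the eye as a partially observed dynamical system can be illustrated with a toy sketch. This is not the EyeWorld architecture, which the record does not describe in detail; it is a minimal illustration of the general idea, assuming a two-number latent state, a geometric growth rule, and two modality names ("fundus", "oct") chosen purely for the example. The class name, parameters, and observation maps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = List[float]  # [anatomy, lesion]: a hypothetical two-number latent state


@dataclass
class ToyOcularWorldModel:
    """Toy stand-in for a latent-state world model (not the EyeWorld architecture)."""

    decoders: Dict[str, Callable[[State], List[float]]]  # one observation map per modality
    growth_rate: float  # assumed fractional lesion growth per year

    def transition(self, state: State, dt_years: float) -> State:
        # Time-conditioned transition: anatomy is held fixed, the lesion term
        # grows geometrically with the elapsed interval.
        anatomy, lesion = state
        return [anatomy, lesion * (1.0 + self.growth_rate) ** dt_years]

    def observe(self, state: State, modality: str) -> List[float]:
        # Each modality sees the same latent state through its own map.
        return self.decoders[modality](state)


model = ToyOcularWorldModel(
    decoders={
        "fundus": lambda s: [s[0] + s[1]],    # hypothetical: one mixed intensity value
        "oct": lambda s: [s[0], 2.0 * s[1]],  # hypothetical: two separate channels
    },
    growth_rate=0.5,
)

z0 = [1.0, 0.2]                 # latent state at baseline
z2 = model.transition(z0, 2.0)  # latent state forecast two years later

# The same state, viewed through each modality, before and after the transition.
print(model.observe(z0, "fundus"), model.observe(z2, "fundus"))
print(model.observe(z0, "oct"), model.observe(z2, "oct"))
```

In this toy, forecasting happens in the latent state and every modality is rendered from that single state, so the anatomy component is identical before and after the transition while only the lesion component changes.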
institution arXiv
title EyeWorld: A Generative World Model of Ocular State and Dynamics
topic Computer Vision and Pattern Recognition