| Main Authors: | Qiu, Jianing, Wu, Jian, Wei, Hao, Shi, Peilun, Zhang, Minqing, Sun, Yunyun, Li, Lin, Liu, Hanruo, Liu, Hongyi, Hou, Simeng, Zhao, Yuyang, Shi, Xuehui, Xian, Junfang, Qu, Xiaoxia, Zhu, Sirui, Pan, Lijie, Chen, Xiaoniao, Zhang, Xiaojia, Jiang, Shuai, Wang, Kebing, Yang, Chenlong, Chen, Mingqiang, Fan, Sujie, Hu, Jianhua, Lv, Aiguo, Miao, Hui, Guo, Li, Zhang, Shujun, Pei, Cheng, Fan, Xiaojuan, Lei, Jianqin, Wei, Ting, Duan, Junguo, Liu, Chun, Xia, Xiaobo, Xiong, Siqi, Li, Junhong, Lo, Benny, Tham, Yih Chung, Wong, Tien Yin, Wang, Ningli, Yuan, Wu |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.04992 |
Similar Items
VisionCLIP: An Med-AIGC based Ethical Language-Image Foundation Model for Generalizable Retina Image Analysis
by: Wei, Hao, et al.
Published: (2024)
ViLReF: An Expert Knowledge Enabled Vision-Language Retinal Foundation Model
by: Yang, Shengzhu, et al.
Published: (2024)
RetSTA: An LLM-Based Approach for Standardizing Clinical Fundus Image Reports
by: Cai, Jiushen, et al.
Published: (2025)
RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports
by: Du, Jiawei, et al.
Published: (2024)
Planning with Logical Graph-based Language Model for Instruction Generation
by: Zhang, Fan, et al.
Published: (2023)
DRScaffold: Boosting Dense-Scene Reasoning in Lightweight Vision Language Models
by: Shi, Xinrui, et al.
Published: (2026)
METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models
by: Liu, Yuchen, et al.
Published: (2025)
Journal of Ophthalmic & Vision Research
Published: (2009)
A Labeled Ophthalmic Ultrasound Dataset with Medical Report Generation Based on Cross-modal Deep Learning
by: Wang, Jing, et al.
Published: (2024)
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
by: Chen, Jiankang, et al.
Published: (2025)
Distinct spatiotemporal patterns of juxtacortical microstructure in Alzheimer's Disease
by: Binyin Li, et al.
Published: (2025)
Multi-Cache Enhanced Prototype Learning for Test-Time Generalization of Vision-Language Models
by: Chen, Xinyu, et al.
Published: (2025)
SCRWKV: Ultra-Compact Structure-Calibrated Vision-RWKV for Topological Crack Segmentation
by: Zhang, Hanxu, et al.
Published: (2026)
WDMamba: When Wavelet Degradation Prior Meets Vision Mamba for Image Dehazing
by: Sun, Jie, et al.
Published: (2025)
Complex Neutrosophic α-Discounting Method for Multi-Criteria Risk Assessment under Periodic and Indeterminate Comprehensive Agro-Meteorological Hazards
by: Qiang Li, et al.
Published: (2025)
Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
by: Liu, Ting, et al.
Published: (2024)
LaMOT: Language-Guided Multi-Object Tracking
by: Li, Yunhao, et al.
Published: (2024)
IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems
by: Mao, Yihuan, et al.
Published: (2024)
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
by: Zhang, Chunhui, et al.
Published: (2023)
InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models
by: Yan, Yu, et al.
Published: (2024)
Exact CHY Integrand Construction Using Combinatorial Neural Networks and Discrete Optimization
by: Li, Simeng, et al.
Published: (2025)
QVLA: Not All Channels Are Equal in Vision-Language-Action Model's Quantization
by: Xu, Yuhao, et al.
Published: (2026)
Dynamics and Control of Vision-Aided Multi-UAV-tethered Netted System Capturing Non-Cooperative Target
by: Liu, Runhan, et al.
Published: (2025)
Saliency-Aware Multi-Route Thinking: Revisiting Vision-Language Reasoning
by: Shi, Mingjia, et al.
Published: (2026)
SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures
by: Liu, Hui, et al.
Published: (2025)
Prismer: A Vision-Language Model with Multi-Task Experts
by: Liu, Shikun, et al.
Published: (2023)
ScaleKD: Strong Vision Transformers Could Be Excellent Teachers
by: Fan, Jiawei, et al.
Published: (2024)
Point-It-Out: Benchmarking Embodied Reasoning for Vision Language Models in Multi-Stage Visual Grounding
by: Xue, Haotian, et al.
Published: (2025)
Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts
by: Chen, Yingfa, et al.
Published: (2024)
From Street View to Visual Network: Mapping the Visibility of Urban Landmarks with Vision-Language Models
by: Fan, Zicheng, et al.
Published: (2025)
Research on Constructing a Competency Model for Ophthalmic Nurses Based on Delphi Method
by: Sumei Liu, et al.
Published: (2025)
SGW-based Multi-Task Learning in Vision Tasks
by: Zhang, Ruiyuan, et al.
Published: (2024)
MXene Synthesis and Carbon Capture Applications: Mini‐Review
by: Xinxing Li, et al.
Published: (2024)
Summit Vitals: Multi-Camera and Multi-Signal Biosensing at High Altitudes
by: Liu, Ke, et al.
Published: (2024)
ScratchEval : A Multimodal Evaluation Framework for LLMs in Block-Based Programming
by: Si, Yuan, et al.
Published: (2026)
Native Intelligence Emerges from Large-Scale Clinical Practice: A Retinal Foundation Model with Deployment Efficiency
by: Guo, Jia, et al.
Published: (2025)
Interface Electron Transfer Direction‐Tuned Urea Electrooxidation Over Multi‐Interface Nickel Sulfide Heterojunctions
by: Xingyu Guo, et al.
Published: (2024)
Solvent Engineering for Transparent Dispersion of Large Fluoride Scintillating Particles
by: Jiatao Li, et al.
Published: (2026)
Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification
by: Ye, Yaoqin, et al.
Published: (2024)
Tert‐butyl hydroperoxide induces trabecular meshwork cells injury through ferroptotic cell death
by: Xuejing Yan, et al.
Published: (2024)