:: Library Catalog

Bibliographic Details
Main Authors:	Miao, Zheng; Hung, Tien-Chieh
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.19022

Similar Items

Enabling clinical use of foundation models for computational pathology
by: Henriksen, Audun L, et al.
Published: (2026)

Generative deep learning for foundational video translation in ultrasound
by: Tomic, Nikolina, et al.
Published: (2025)

A multimodal vision foundation model for generalizable knee pathology
by: Yu, Kang, et al.
Published: (2026)

Driving scenario generation and evaluation using a structured layer representation and foundational models
by: Hubert, Arthur, et al.
Published: (2025)

Deep learning framework for crater detection and identification on the Moon and Mars
by: Ma, Yihan, et al.
Published: (2025)

Thinker: A vision-language foundation model for embodied intelligence
by: Pan, Baiyu, et al.
Published: (2026)

An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases
by: Weng, Futian, et al.
Published: (2022)

Prompting with the human-touch: evaluating model-sensitivity of foundation models for musculoskeletal CT segmentation
by: Magg, Caroline, et al.
Published: (2026)

ActiveMark: on watermarking of visual foundation models via massive activations
by: Chistyakova, Anna, et al.
Published: (2025)

Attention-based multiple instance learning for predominant growth pattern prediction in lung adenocarcinoma WSI using foundation models
by: Perez-Herrera, Laura Valeria, et al.
Published: (2026)

QuarterMap: Efficient Post-Training Token Pruning for Visual State Space Models
by: Chi, Tien-Yu, et al.
Published: (2025)

Developing a foundation model for high-resolution remote sensing data of the Netherlands
by: Vermeeren, Paul, et al.
Published: (2026)

Near, far: Patch-ordering enhances vision foundation models' scene understanding
by: Pariza, Valentinos, et al.
Published: (2024)

Decompose the model: Mechanistic interpretability in image models with Generalized Integrated Gradients (GIG)
by: Kim, Yearim, et al.
Published: (2024)

Low-Field Magnetic Resonance Image Quality Enhancement using a Conditional Flow Matching Model
by: Nguyen, Huu Tien, et al.
Published: (2025)

Paving the way toward foundation models for irregular and unaligned Satellite Image Time Series
by: Dumeur, Iris, et al.
Published: (2024)

Toward explainable AI approaches for breast imaging: adapting foundation models to diverse populations
by: Cavalcante, Guilherme J., et al.
Published: (2025)

Benchmarking histopathology foundation models in a multi-center dataset for skin cancer subtyping
by: Meseguer, Pablo, et al.
Published: (2025)

EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis
by: Shi, Danli, et al.
Published: (2024)

Geospatial foundation models for image analysis: evaluating and enhancing NASA-IBM Prithvi's domain adaptability
by: Hsu, Chia-Yu, et al.
Published: (2024)

DA-SSL: self-supervised domain adaptor to leverage foundational models in TURBT histopathology slides
by: Zhang, Haoyue, et al.
Published: (2025)

Enhancing Interpretability of Vertebrae Fracture Grading using Human-interpretable Prototypes
by: Sinhamahapatra, Poulami, et al.
Published: (2024)

An analysis of HOI: using a training-free method with multimodal visual foundation models when only the test set is available, without the training set
by: Ai, Chaoyi
Published: (2024)

Revisiting semi-supervised learning in the era of foundation models
by: Zhang, Ping, et al.
Published: (2025)

AttnMod: Attention-Based New Art Styles
by: Su, Shih-Chieh
Published: (2024)

PB-IAD: Utilizing multimodal foundation models for semantic industrial anomaly detection in dynamic manufacturing environments
by: Hofmann, Bernd, et al.
Published: (2025)

Training a high-performance retinal foundation model with half-the-data and 400 times less compute
by: Engelmann, Justin, et al.
Published: (2024)

V"Mean"ba: Visual State Space Models only need 1 hidden dimension
by: Chi, Tien-Yu, et al.
Published: (2024)

Training Deep Visual Networks Beyond Loss and Accuracy Through a Dynamical Systems Approach
by: La Quang, Hai, et al.
Published: (2026)

Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content
by: Du, Zhicheng, et al.
Published: (2024)

Leveraging AI multimodal geospatial foundation models for improved near-real-time flood mapping at a global scale
by: Tulbure, Mirela G., et al.
Published: (2025)

Grounding-Aware Token Pruning: Recovering from Drastic Performance Drops in Visual Grounding Caused by Pruning
by: Chien, Tzu-Chun, et al.
Published: (2025)

Full end-to-end diagnostic workflow automation of 3D OCT via foundation model-driven AI for retinal diseases
by: Zhang, Jinze, et al.
Published: (2026)

TasselNetV4: A vision foundation model for cross-scene, cross-scale, and cross-species plant counting
by: Hu, Xiaonan, et al.
Published: (2025)

Explaining with trees: interpreting CNNs using hierarchies
by: Rodrigues, Caroline Mazini, et al.
Published: (2024)

TCMM: Token Constraint and Multi-Scale Memory Bank of Contrastive Learning for Unsupervised Person Re-identification
by: Zhu, Zheng-An, et al.
Published: (2025)

Multi-Modal interpretable automatic video captioning
by: Hanna-Asaad, Antoine, et al.
Published: (2024)

ReynoldsFlow: Exquisite Flow Estimation via Reynolds Transport Theorem
by: Chen, Yu-Hsi, et al.
Published: (2025)

Robustness Evaluation of OCR-based Visual Document Understanding under Multi-Modal Adversarial Attacks
by: Tien, Dong Nguyen, et al.
Published: (2025)

Re-identification from histopathology images
by: Ganz, Jonathan, et al.
Published: (2024)