| Main Authors: | Wang, Zhao, Liu, Chang, Zhang, Shaoting, Dou, Qi |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2306.16741 |
Similar Items
Large-scale Self-supervised Video Foundation Model for Intelligent Surgery
by: Yang, Shu, et al.
Published: (2025)
Structure-aware World Model for Probe Guidance via Large-scale Self-supervised Pre-train
by: Jiang, Haojun, et al.
Published: (2024)
NimbleD: Enhancing Self-supervised Monocular Depth Estimation with Pseudo-labels and Large-scale Video Pre-training
by: Luginov, Albert, et al.
Published: (2024)
Domain-Adaptive Pre-training of Self-Supervised Foundation Models for Medical Image Classification in Gastrointestinal Endoscopy
by: Roth, Marcel, et al.
Published: (2024)
OpenPath: Open-Set Active Learning for Pathology Image Classification via Pre-trained Vision-Language Models
by: Zhong, Lanfeng, et al.
Published: (2025)
Efficient Transferability Assessment for Selection of Pre-trained Detectors
by: Wang, Zhao, et al.
Published: (2024)
SPAST: Arbitrary Style Transfer with Style Priors via Pre-trained Large-scale Model
by: Zhang, Zhanjie, et al.
Published: (2025)
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
by: Zheng, Minghang, et al.
Published: (2024)
Large-scale Pre-training for Grounded Video Caption Generation
by: Kazakos, Evangelos, et al.
Published: (2025)
EndoMamba: An Efficient Foundation Model for Endoscopic Videos via Hierarchical Pre-training
by: Tian, Qingyao, et al.
Published: (2025)
Multi-modal Vision Pre-training for Medical Image Analysis
by: Rui, Shaohao, et al.
Published: (2024)
E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training
by: Zhao, Qitao, et al.
Published: (2025)
Advancing Video Self-Supervised Learning via Image Foundation Models
by: Wu, Jingwei, et al.
Published: (2025)
Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
by: Wang, Haixin, et al.
Published: (2023)
MedCAL-Bench: A Comprehensive Benchmark on Cold-Start Active Learning with Foundation Models for Medical Image Analysis
by: Zhu, Ning, et al.
Published: (2025)
EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning
by: Yao, Jingfeng, et al.
Published: (2024)
Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation
by: Lin, Sihao, et al.
Published: (2025)
Self-supervised Pre-training of Text Recognizers
by: Kišš, Martin, et al.
Published: (2024)
Revealing Latent Information: A Physics-inspired Self-supervised Pre-training Framework for Noisy and Sparse Events
by: Zhu, Lin, et al.
Published: (2025)
Towards Data-Efficient Video Pre-training with Frozen Image Foundation Models
by: Orlova, Svetlana, et al.
Published: (2026)
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
by: Shui, Zhongyi, et al.
Published: (2025)
Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis
by: Zhang, Bowen, et al.
Published: (2024)
MVP: Enhancing Video Large Language Models via Self-supervised Masked Video Prediction
by: Sun, Xiaokun, et al.
Published: (2026)
Temporal-Consistent Video Restoration with Pre-trained Diffusion Models
by: Wang, Hengkang, et al.
Published: (2025)
Endora: Video Generation Models as Endoscopy Simulators
by: Li, Chenxin, et al.
Published: (2024)
SMFormer: Empowering Self-supervised Stereo Matching via Foundation Models and Data Augmentation
by: Wang, Yun, et al.
Published: (2026)
An Explainable Biomedical Foundation Model via Large-Scale Concept-Enhanced Vision-Language Pre-training
by: Nie, Yuxiang, et al.
Published: (2025)
Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation
by: Miao, Ziliang, et al.
Published: (2025)
PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification
by: Hu, Bin, et al.
Published: (2024)
Fairness Analysis of CLIP-Based Foundation Models for X-Ray Image Classification
by: Sun, Xiangyu, et al.
Published: (2025)
UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models
by: Chen, Lan, et al.
Published: (2025)
MedDiff-FM: A Diffusion-based Foundation Model for Versatile Medical Image Applications
by: Yu, Yongrui, et al.
Published: (2024)
A Multimodal Pre-trained Network for Integrated EEG-Video Seizure Detection
by: Lu, Tong, et al.
Published: (2026)
RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports
by: Du, Jiawei, et al.
Published: (2024)
InvCoSS: Inversion-driven Continual Self-supervised Learning in Medical Multi-modal Image Pre-training
by: Luo, Zihao, et al.
Published: (2025)
Muskie: Multi-view Masked Image Modeling for 3D Vision Pre-training
by: Li, Wenyu, et al.
Published: (2025)
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
by: Luo, Gen, et al.
Published: (2024)
UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models
by: Wang, Yimu, et al.
Published: (2025)
Inter-slice Super-resolution of Magnetic Resonance Images by Pre-training and Self-supervised Fine-tuning
by: Wang, Xin, et al.
Published: (2024)
Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training
by: Lei, Xingliang, et al.
Published: (2024)