:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Jieming; Feng, Qiuxiao; Wang, Zhuohan; Ma, Xiaochen
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.16083

Similar Items

Specialized Foundation Models Struggle to Beat Supervised Baselines
by: Xu, Zongzhe, et al.
Published: (2024)

Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
by: Seo, Minseok, et al.
Published: (2025)

From SAM to DINOv2: Towards Distilling Foundation Models to Lightweight Baselines for Generalized Polyp Segmentation
by: Agnihotri, Shivanshu, et al.
Published: (2025)

Brought a Gun to a Knife Fight: Modern VFM Baselines Outgun Specialized Detectors on In-the-Wild AI Image Detection
by: Zhou, Yue, et al.
Published: (2025)

ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization
by: Du, Bo, et al.
Published: (2025)

Baseline Method of the Foundation Model Challenge for Ultrasound Image Analysis
by: Deng, Bo, et al.
Published: (2026)

Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis
by: Zhang, Bowen, et al.
Published: (2024)

DINOv3
by: Siméoni, Oriane, et al.
Published: (2025)

FIND: A Simple yet Effective Baseline for Diffusion-Generated Image Detection
by: Li, Jie, et al.
Published: (2026)

Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation
by: Ma, Qinghe, et al.
Published: (2025)

A Simple Baseline with Single-encoder for Referring Image Segmentation
by: Yu, Seonghoon, et al.
Published: (2024)

SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution
by: Xie, Liangbin, et al.
Published: (2025)

DINOv3 Visual Representations for Blueberry Perception Toward Robotic Harvesting
by: Wang, Rui-Feng, et al.
Published: (2026)

Community Forensics: Using Thousands of Generators to Train Fake Image Detectors
by: Park, Jeongsoo, et al.
Published: (2024)

MambaIR: A Simple Baseline for Image Restoration with State-Space Model
by: Guo, Hang, et al.
Published: (2024)

MM-DINOv2: Adapting Foundation Models for Multi-Modal Medical Image Analysis
by: Scholz, Daniel, et al.
Published: (2025)

A Simple and Better Baseline for Visual Grounding
by: Wang, Jingchao, et al.
Published: (2025)

AD-DINOv3: Enhancing DINOv3 for Zero-Shot Anomaly Detection with Anomaly-Aware Calibration
by: Yuan, Jingyi, et al.
Published: (2025)

Real-Time Object Detection Meets DINOv3
by: Huang, Shihua, et al.
Published: (2025)

Revisiting Birds Eye View Perception Models with Frozen Foundation Models: DINOv2 and Metric3Dv2
by: Hayes, Seamie, et al.
Published: (2025)

Rethinking Cross-Generator Image Forgery Detection through DINOv3
by: Huang, Zhenglin, et al.
Published: (2025)

DINOv3 with Test-Time Training for Medical Image Registration
by: Wang, Shansong, et al.
Published: (2025)

CubeFormer: A Simple yet Effective Baseline for Lightweight Image Super-Resolution
by: Wang, Jikai, et al.
Published: (2024)

Optimizing DINOv2 with Registers for Face Anti-Spoofing
by: Feng, Mika, et al.
Published: (2025)

SimToken: A Simple Baseline for Referring Audio-Visual Segmentation
by: Jin, Dian, et al.
Published: (2025)

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model
by: Wang, Zeyu, et al.
Published: (2024)

A Simple Aerial Detection Baseline of Multimodal Language Models
by: Li, Qingyun, et al.
Published: (2025)

SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation
by: Ye, Jin, et al.
Published: (2024)

A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation
by: Wang, Zhengbo, et al.
Published: (2024)

Spatial Autoregressive Modeling of DINOv3 Embeddings for Unsupervised Anomaly Detection
by: Erdil, Ertunc, et al.
Published: (2026)

When Detectors Forget Forensics: Blocking Semantic Shortcuts for Generalizable AI-Generated Image Detection
by: Shuai, Chao, et al.
Published: (2026)

Evaluating General Purpose Vision Foundation Models for Medical Image Analysis: An Experimental Study of DINOv2 on Radiology Benchmarks
by: Baharoon, Mohammed, et al.
Published: (2023)

Does DINOv3 Set a New Medical Vision Standard? Benchmarking 2D and 3D Classification, Segmentation, and Registration
by: Liu, Che, et al.
Published: (2025)

On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines
by: Kuzucu, Selim, et al.
Published: (2024)

BSED: Baseline Shapley-Based Explainable Detector
by: Kuroki, Michihiro, et al.
Published: (2023)

SINDER: Repairing the Singular Defects of DINOv2
by: Wang, Haoqi, et al.
Published: (2024)

INSID3: Training-Free In-Context Segmentation with DINOv3
by: Cuttano, Claudia, et al.
Published: (2026)

Finally Outshining the Random Baseline: A Simple and Effective Solution for Active Learning in 3D Biomedical Imaging
by: Lüth, Carsten T., et al.
Published: (2026)

HSDA: High-frequency Shuffle Data Augmentation for Bird's-Eye-View Map Segmentation
by: Glisson, Calvin, et al.
Published: (2024)

DINO-MVR: Multi-View Readout of Frozen DINOv3 for Annotation-Efficient Medical Segmentation
by: Jiang, Wei, et al.
Published: (2026)