| Main Authors: | Xiong, Huimin, Meng, Zijie, Hu, Tianxiang, Zhou, Chenyi, Feng, Yang, Liu, Zuozhu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.16781 |
Similar Items
DentVLM: A Multimodal Vision-Language Model for Comprehensive Dental Diagnosis and Enhanced Clinical Practice
by: Meng, Zijie, et al.
Published: (2025)
Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
by: Ben-Hamadou, Achraf, et al.
Published: (2025)
KPL: Training-Free Medical Knowledge Mining of Vision-Language Models
by: Liu, Jiaxiang, et al.
Published: (2025)
Teeth3DS+: An Extended Benchmark for Intraoral 3D Scans Analysis
by: Ben-Hamadou, Achraf, et al.
Published: (2022)
3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
by: Gai, Xiaotang, et al.
Published: (2025)
MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale
by: Gai, Xiaotang, et al.
Published: (2024)
Silhouette-to-Contour Registration: Aligning Intraoral Scan Models with Cephalometric Radiographs
by: Miao, Yiyi, et al.
Published: (2025)
Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding
by: Jiang, Songtao, et al.
Published: (2025)
DinoDental: Benchmarking DINOv3 as a Unified Vision Encoder for Dental Image Analysis
by: Tang, Kun, et al.
Published: (2026)
Dental3R: Geometry-Aware Pairing for Intraoral 3D Reconstruction from Sparse-View Photographs
by: Miao, Yiyi, et al.
Published: (2025)
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
by: Zhuang, Chenyi, et al.
Published: (2024)
Quantized Prompt for Efficient Generalization of Vision-Language Models
by: Hao, Tianxiang, et al.
Published: (2024)
Modest-Align: Data-Efficient Alignment for Vision-Language Models
by: Liu, Jiaxiang, et al.
Published: (2025)
Fair-MoE: Fairness-Oriented Mixture of Experts in Vision-Language Models
by: Wang, Peiran, et al.
Published: (2025)
Evaluating the Suitability of Different Intraoral Scan Resolutions for Deep Learning-Based Tooth Segmentation
by: Weekley, Daron, et al.
Published: (2025)
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
by: Jiang, Songtao, et al.
Published: (2024)
HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models
by: Jiang, Songtao, et al.
Published: (2025)
Understanding Degradation with Vision Language Model
by: Lan, Guanzhou, et al.
Published: (2026)
SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model
by: Zhan, Yang, et al.
Published: (2024)
Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model
by: Shi, Yiming, et al.
Published: (2024)
Advancing Lung Disease Diagnosis in 3D CT Scans
by: Li, Qingqiu, et al.
Published: (2025)
Med-GLIP: Advancing Medical Language-Image Pre-training with Large-scale Grounded Dataset
by: Deng, Ziye, et al.
Published: (2025)
HICT: High-precision 3D CBCT reconstruction from a single X-ray
by: Ma, Wen, et al.
Published: (2026)
High-Fidelity 3D Tooth Reconstruction by Fusing Intraoral Scans and CBCT Data via a Deep Implicit Representation
by: Zhu, Yi, et al.
Published: (2026)
Med3D-R1: Incentivizing Clinical Reasoning in 3D Medical Vision-Language Models for Abnormality Diagnosis
by: Lai, Haoran, et al.
Published: (2026)
PX2Tooth: Reconstructing the 3D Point Cloud Teeth from a Single Panoramic X-ray
by: Ma, Wen, et al.
Published: (2024)
LT-Gaussian: Long-Term Map Update Using 3D Gaussian Splatting for Autonomous Driving
by: Cheng, Luqi, et al.
Published: (2025)
Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos
by: Zuo, Zhi, et al.
Published: (2025)
Delving into Out-of-Distribution Detection with Medical Vision-Language Models
by: Ju, Lie, et al.
Published: (2025)
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization
by: Jin, Yang, et al.
Published: (2023)
Are Vision Language Models Ready for Clinical Diagnosis? A 3D Medical Benchmark for Tumor-centric Visual Question Answering
by: Chen, Yixiong, et al.
Published: (2025)
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
by: Zhou, Yue, et al.
Published: (2024)
GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving
by: Qi, Zhangshuo, et al.
Published: (2024)
A Unified Perspective on Adversarial Membership Manipulation in Vision Models
by: Gao, Ruize, et al.
Published: (2026)
ARM3D: Attention-based relation module for indoor 3D object detection
by: Lan, Yuqing, et al.
Published: (2022)
Hyperbolic and Evidence-Prioritized Experts for Large Vision-Language Models
by: Zhou, Zijie, et al.
Published: (2026)
Unified Personalized Reward Model for Vision Generation
by: Wang, Yibin, et al.
Published: (2026)
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models
by: Jiang, Songtao, et al.
Published: (2024)
CalliReader: Contextualizing Chinese Calligraphy via an Embedding-Aligned Vision-Language Model
by: Luo, Yuxuan, et al.
Published: (2025)
From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach
by: Wang, Xilin, et al.
Published: (2024)