:: Library Catalog

Bibliographic Details
Main Authors:	He, Zhengxu; Li, Jun; Wu, Zhijian
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.15166

Similar Items

Revisiting Cross-Architecture Distillation: Adaptive Dual-Teacher Transfer for Lightweight Video Models
by: Peng, Ying, et al.
Published: (2025)

AMMKD: Adaptive Multimodal Multi-teacher Distillation for Lightweight Vision-Language Models
by: Li, Yuqi, et al.
Published: (2025)

Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
by: Yu, Yu-Chu, et al.
Published: (2024)

KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment
by: Shi, Zhengxu
Published: (2024)

Distributionally Robust Alignment for Medical Federated Vision-Language Pre-training Under Data Heterogeneity
by: Shuai, Zitao, et al.
Published: (2024)

Visual-Advantage On-Policy Distillation for Vision-Language Models
by: Liu, Ruiqi, et al.
Published: (2026)

MMT-ARD: Multimodal Multi-Teacher Adversarial Distillation for Robust Vision-Language Models
by: Li, Yuqi, et al.
Published: (2025)

DFMSD: Dual Feature Masking Stage-wise Knowledge Distillation for Object Detection
by: Zhang, Zhourui, et al.
Published: (2024)

Optimizing Parking Space Classification: Distilling Ensembles into Lightweight Classifiers
by: Alves, Paulo Luza, et al.
Published: (2024)

VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation
by: Bousselham, Walid, et al.
Published: (2025)

Vision-Language Dataset Distillation
by: Wu, Xindi, et al.
Published: (2023)

Lightweight Model Pre-training via Language Guided Knowledge Distillation
by: Li, Mingsheng, et al.
Published: (2024)

Distilling 3D Spatial Reasoning into a Lightweight Vision-Language Model with CoT
by: Asfour, Alaa, et al.
Published: (2026)

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models
by: Lee, Byung-Kwan, et al.
Published: (2025)

Distilling Vision-Language Models on Millions of Videos
by: Zhao, Yue, et al.
Published: (2024)

Adaptive Teaching with Shared Classifier for Knowledge Distillation
by: Jang, Jaeyeon, et al.
Published: (2024)

Dataset Distillation via Vision-Language Category Prototype
by: Zou, Yawen, et al.
Published: (2025)

Shrinking the Teacher: An Adaptive Teaching Paradigm for Asymmetric EEG-Vision Alignment
by: Wu, Lukun, et al.
Published: (2025)

MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation
by: Zhu, Junyou, et al.
Published: (2024)

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models
by: Zhang, Tiezheng, et al.
Published: (2025)

Enhancing Targeted Adversarial Attacks on Large Vision-Language Models via Intermediate Projector
by: Cao, Yiming, et al.
Published: (2025)

PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
by: Li, Zheng, et al.
Published: (2024)

VPTracker: Global Vision-Language Tracking via Visual Prompt
by: Wang, Jingchao, et al.
Published: (2025)

Lightweight Vision Transformer with Bidirectional Interaction
by: Fan, Qihang, et al.
Published: (2023)

VGGDrive: Empowering Vision-Language Models with Cross-View Geometric Grounding for Autonomous Driving
by: Wang, Jie, et al.
Published: (2026)

ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts
by: Wu, Yuanchen, et al.
Published: (2025)

SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model
by: Cao, Bin, et al.
Published: (2024)

Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction
by: Li, Yuanbo, et al.
Published: (2026)

Long Context Transfer from Language to Vision
by: Zhang, Peiyuan, et al.
Published: (2024)

SDRT: Enhance Vision-Language Models by Self-Distillation with Diverse Reasoning Traces
by: Wu, Guande, et al.
Published: (2025)

PartDistill: 3D Shape Part Segmentation by Vision-Language Model Distillation
by: Umam, Ardian, et al.
Published: (2023)

RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing
by: Zhou, Huiling, et al.
Published: (2024)

AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
by: Zhu, Yuhan, et al.
Published: (2024)

TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation
by: Shen, Yehui, et al.
Published: (2024)

Sample-wise Adaptive Weighting for Transfer Consistency in Adversarial Distillation
by: Lee, Hongsin, et al.
Published: (2025)

Unleashing Foundation Vision Models: Adaptive Transfer for Diverse Data-Limited Scientific Domains
by: Li, Qiankun, et al.
Published: (2025)

Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
by: Hu, Yushi, et al.
Published: (2023)

VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation
by: Dong, Shaoqi, et al.
Published: (2025)

Continual Distillation of Teachers from Different Domains
by: Michel, Nicolas, et al.
Published: (2026)

POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
by: Liu, Yuan, et al.
Published: (2025)