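Staff View: Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation :: Library Catalog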

Bibliographic Details
Main Authors:	Chen, Yilong; Xu, Zongyi; Huang, Xiaoshui; Zhao, Shanshan; Jiang, Xinqi; Gao, Xinyu; Gao, Xinbo
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2409.02438

_version_	1866909304229986304
author	Chen, Yilong; Xu, Zongyi; Huang, Xiaoshui; Zhao, Shanshan; Jiang, Xinqi; Gao, Xinyu; Gao, Xinbo
author_facet	Chen, Yilong; Xu, Zongyi; Huang, Xiaoshui; Zhao, Shanshan; Jiang, Xinqi; Gao, Xinyu; Gao, Xinbo
contents	Compared to single-modal knowledge distillation, cross-modal knowledge distillation faces more severe challenges due to domain gaps between modalities. Although many methods have been proposed to overcome these challenges, there is still limited research on how domain gaps affect cross-modal knowledge distillation. This paper provides an in-depth analysis and evaluation of this issue. We first introduce the Non-Target Divergence Hypothesis (NTDH) to reveal the impact of domain gaps on cross-modal knowledge distillation. Our key finding is that domain gaps between modalities lead to distribution differences in non-target classes, and the smaller these differences, the better the performance of cross-modal knowledge distillation. Subsequently, based on Vapnik-Chervonenkis (VC) theory, we derive upper and lower bounds on the approximation error for cross-modal knowledge distillation, thereby theoretically validating the NTDH. Finally, experiments on five cross-modal datasets further confirm the validity, generalisability, and applicability of the NTDH.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_02438
institution	arXiv
publishDate	2024
record_format	arxiv
title	Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2409.02438
