Bibliographic Details
Main Authors: Shao, Shitong; Shen, Zhiqiang; Gong, Linrui; Chen, Huanran; Dai, Xu
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2402.02012
author Shao, Shitong
Shen, Zhiqiang
Gong, Linrui
Chen, Huanran
Dai, Xu
contents In this paper, we propose a novel knowledge transfer framework that introduces continuous normalizing flows for progressive knowledge transformation and leverages multi-step sampling strategies to achieve precise knowledge transfer. We name this framework Knowledge Transfer with Flow Matching (FM-KT). It can be integrated with a metric-based distillation method of any form (e.g., vanilla KD, DKD, PKD, and DIST) and a meta-encoder with any available architecture (e.g., CNN, MLP, and Transformer). By introducing stochastic interpolants, FM-KT is readily amenable to arbitrary noise schedules (e.g., VP-ODE, VE-ODE, and Rectified flow) for normalized flow path estimation. We theoretically demonstrate that the training objective of FM-KT is equivalent to minimizing an upper bound on the negative log-likelihood of the teacher feature map or logit. In addition, FM-KT can be viewed as a unique implicit ensemble method that leads to performance gains. With a slight modification, FM-KT can also be transformed into an online distillation framework, OFM-KT, with desirable performance gains. Through extensive experiments on the CIFAR-100, ImageNet-1k, and MS-COCO datasets, we empirically validate the scalability and state-of-the-art performance of our proposed methods among relevant comparison approaches.
format Preprint
id arxiv_https___arxiv_org_abs_2402_02012
institution arXiv
publishDate 2024
record_format arxiv
title Precise Knowledge Transfer via Flow Matching
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2402.02012
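
The abstract in this record mentions stochastic interpolants, a Rectified flow schedule, and multi-step sampling. As a rough illustration of those general ideas only, the following is a minimal sketch of a linear (rectified-flow) interpolant and a fixed-step Euler sampler. It is not the FM-KT implementation: every function name here is hypothetical, and the closed-form toy velocity field stands in for the learned meta-encoder, an assumption made so that the example runs without any training.

```python
from typing import Callable, List

Vector = List[float]


def interpolate(x0: Vector, x1: Vector, t: float) -> Vector:
    """Linear (rectified-flow) interpolant: x_t = (1 - t) * x0 + t * x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: Vector, x1: Vector) -> Vector:
    """Time derivative of the linear path, d x_t / dt = x1 - x0.

    In generic flow matching this is the regression target for a
    velocity model evaluated at (x_t, t).
    """
    return [b - a for a, b in zip(x0, x1)]


def make_toy_velocity(x1: Vector) -> Callable[[Vector, float], Vector]:
    """Hypothetical stand-in for a trained velocity model.

    For a single known endpoint x1, the velocity that carries any point x
    at time t to x1 at time 1 along a straight line is (x1 - x) / (1 - t).
    """
    def velocity(x: Vector, t: float) -> Vector:
        return [(b - xi) / (1.0 - t) for xi, b in zip(x, x1)]
    return velocity


def euler_sample(velocity: Callable[[Vector, float], Vector],
                 x_start: Vector, num_steps: int) -> Vector:
    """Multi-step sampling: integrate dx/dt = velocity(x, t) from t=0 to t=1
    with num_steps equal Euler steps."""
    x = list(x_start)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h  # largest t used is (num_steps - 1) / num_steps < 1
        v = velocity(x, t)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    start = [0.0, 0.0]     # illustrative starting point
    target = [1.0, -2.0]   # illustrative endpoint
    print(interpolate(start, target, 0.5))
    print(euler_sample(make_toy_velocity(target), start, num_steps=4))
```

With a learned velocity model the path is generally not exactly straight, so the number of Euler steps becomes a trade-off between compute and accuracy. In this toy setting the velocity is constant along the path, so any step count reaches the endpoint.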