Bibliographic Details
Main Author: Montesuma, Eduardo Fernandes
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2504.01757
_version_ 1866914026861101056
author Montesuma, Eduardo Fernandes
author_facet Montesuma, Eduardo Fernandes
contents Knowledge Distillation (KD) seeks to transfer the knowledge of a teacher to a student neural net. This is often done by matching the networks' predictions (i.e., their outputs), but several recent works have proposed matching the distributions of the networks' activations (i.e., their features) instead, a process known as \emph{distribution matching}. In this paper, we propose a unifying framework, Knowledge Distillation through Distribution Matching (KD$^{2}$M), which formalizes this strategy. Our contributions are threefold. We i) provide an overview of distribution metrics used in distribution matching, ii) benchmark on computer vision datasets, and iii) derive new theoretical results for KD.
format Preprint
id arxiv_https___arxiv_org_abs_2504_01757
institution arXiv
publishDate 2025
record_format arxiv
title KD$^{2}$M: A unifying framework for feature knowledge distillation
topic Machine Learning
url https://arxiv.org/abs/2504.01757