MARC View :: Library Catalog

Bibliographic Details
Main author:	Montesuma, Eduardo Fernandes
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online access:	https://arxiv.org/abs/2504.01757

_version_	1866914026861101056
author	Montesuma, Eduardo Fernandes
author_facet	Montesuma, Eduardo Fernandes
contents	Knowledge Distillation (KD) seeks to transfer the knowledge of a teacher to a student neural network. This is often done by matching the networks' predictions (i.e., their outputs), but several recent works have proposed matching the distributions of the networks' activations (i.e., their features), a process known as distribution matching. In this paper, we propose a unifying framework, Knowledge Distillation through Distribution Matching (KD$^{2}$M), which formalizes this strategy. Our contributions are threefold. We i) provide an overview of distribution metrics used in distribution matching, ii) benchmark on computer vision datasets, and iii) derive new theoretical results for KD.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_01757
institution	arXiv
publishDate	2025
record_format	arxiv
title	KD$^{2}$M: A unifying framework for feature knowledge distillation
topic	Machine Learning
url	https://arxiv.org/abs/2504.01757