Affichage MARC: :: Library Catalog

Enregistré dans:

Détails bibliographiques
Auteurs principaux:	Dean, Nathaniel, Sarkar, Dilip
Format:	Preprint
Publié:	2024
Sujets:	Machine Learning I.5.1
Accès en ligne:	https://arxiv.org/abs/2404.07083
Tags:	Ajouter un tag Pas de tags, Soyez le premier à ajouter un tag!

_version_	1866913309869998080
author	Dean, Nathaniel Sarkar, Dilip
author_facet	Dean, Nathaniel Sarkar, Dilip
contents	Overparameterized deep neural networks (DNNs), if not sufficiently regularized, are susceptible to overfitting their training examples and not generalizing well to test data. To discourage overfitting, researchers have developed multicomponent loss functions that reduce intra-class feature correlation and maximize inter-class feature distance in one or more layers of the network. By analyzing the penultimate feature layer activations output by a DNN's feature extraction section prior to the linear classifier, we find that modified forms of the intra-class feature covariance and inter-class prototype separation are key components of a fundamental Chebyshev upper bound on the probability of misclassification, which we designate the Chebyshev Prototype Risk (CPR). While previous approaches' covariance loss terms scale quadratically with the number of network features, our CPR bound indicates that an approximate covariance loss in log-linear time is sufficient to reduce the bound and is scalable to large architectures. We implement the terms of the CPR bound into our Explicit CPR (exCPR) loss function and observe from empirical results on multiple datasets and network architectures that our training algorithm reduces overfitting and improves upon previous approaches in many settings. Our code is available at https://github.com/Deano1718/Regularization_exCPR .
format	Preprint
id	arxiv_https___arxiv_org_abs_2404_07083
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Minimizing Chebyshev Prototype Risk Magically Mitigates the Perils of Overfitting Dean, Nathaniel Sarkar, Dilip Machine Learning I.5.1 Overparameterized deep neural networks (DNNs), if not sufficiently regularized, are susceptible to overfitting their training examples and not generalizing well to test data. To discourage overfitting, researchers have developed multicomponent loss functions that reduce intra-class feature correlation and maximize inter-class feature distance in one or more layers of the network. By analyzing the penultimate feature layer activations output by a DNN's feature extraction section prior to the linear classifier, we find that modified forms of the intra-class feature covariance and inter-class prototype separation are key components of a fundamental Chebyshev upper bound on the probability of misclassification, which we designate the Chebyshev Prototype Risk (CPR). While previous approaches' covariance loss terms scale quadratically with the number of network features, our CPR bound indicates that an approximate covariance loss in log-linear time is sufficient to reduce the bound and is scalable to large architectures. We implement the terms of the CPR bound into our Explicit CPR (exCPR) loss function and observe from empirical results on multiple datasets and network architectures that our training algorithm reduces overfitting and improves upon previous approaches in many settings. Our code is available at https://github.com/Deano1718/Regularization_exCPR .
title	Minimizing Chebyshev Prototype Risk Magically Mitigates the Perils of Overfitting
topic	Machine Learning I.5.1
url	https://arxiv.org/abs/2404.07083

Documents similaires