Staff View: On Mixup Regularization :: Library Catalog

Bibliographic Details
Main Authors:	Carratino, Luigi; Cissé, Moustapha; Jenatton, Rodolphe; Vert, Jean-Philippe
Format:	Preprint
Published:	2020
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2006.06049

_version_	1866910265104138240
author	Carratino, Luigi; Cissé, Moustapha; Jenatton, Rodolphe; Vert, Jean-Philippe
author_facet	Carratino, Luigi; Cissé, Moustapha; Jenatton, Rodolphe; Vert, Jean-Philippe
contents	Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels. This simple technique has been shown empirically to improve the accuracy of many state-of-the-art models in different settings and applications, but the reasons behind this empirical success remain poorly understood. In this paper, we take a substantial step in explaining the theoretical foundations of Mixup by clarifying its regularization effects. We show that Mixup can be interpreted as a standard empirical risk minimization estimator subject to a combination of data transformation and random perturbation of the transformed data. We gain two core insights from this new interpretation. First, the data transformation suggests that, at test time, a model trained with Mixup should also be applied to transformed data, a one-line change in code that we show empirically to improve both the accuracy and the calibration of the predictions. Second, we show how the random perturbation in the new interpretation of Mixup induces multiple known regularization schemes, including label smoothing and reduction of the Lipschitz constant of the estimator. These schemes interact synergistically, resulting in a self-calibrated and effective regularization effect that prevents overfitting and overconfident predictions. We corroborate our theoretical analysis with experiments that support our conclusions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2006_06049
institution	arXiv
publishDate	2020
record_format	arxiv
title	On Mixup Regularization
topic	Machine Learning
url	https://arxiv.org/abs/2006.06049
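
The contents field describes Mixup as forming new examples from convex combinations of training points and labels. A minimal sketch of that step follows, using only the Python standard library. The function names `mixup_pair` and `mixup_batch`, the `alpha` parameter, and the choice to draw the mixing weight from a Beta(alpha, alpha) distribution are assumptions for illustration (the Beta draw follows the commonly used Mixup formulation); none of them are taken from this record, and the test-time transformation mentioned in the abstract is not shown.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Convex combination of two examples and their (one-hot) labels.

    lam is the mixing weight in [0, 1]; the same weight is applied
    to the inputs and to the labels.
    """
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def mixup_batch(xs, ys, alpha=1.0, rng=None):
    """Mix every example in a batch with a randomly chosen partner.

    Assumption: one weight lam ~ Beta(alpha, alpha) is drawn per batch.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    # Random partner for each example (a shuffled copy of the indices).
    partners = list(range(len(xs)))
    rng.shuffle(partners)
    mixed = [
        mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
        for i, j in enumerate(partners)
    ]
    return [m[0] for m in mixed], [m[1] for m in mixed], lam
```

With `lam = 0.25`, for instance, `mixup_pair([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], 0.25)` yields the input `[3.0, 5.0]` and the soft label `[0.25, 0.75]`: the label is no longer one-hot, which hints at why the abstract relates Mixup to label smoothing.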