Staff View: Understanding the Failure Modes of Out-of-Distribution Generalization :: Library Catalog

Bibliographic Details
Main Authors:	Nagarajan, Vaishnavh; Andreassen, Anders; Neyshabur, Behnam
Format:	Preprint
Published:	2020
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2010.15775

_version_	1866913493027913728
author	Nagarajan, Vaishnavh; Andreassen, Anders; Neyshabur, Behnam
author_facet	Nagarajan, Vaishnavh; Andreassen, Anders; Neyshabur, Behnam
contents	Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only at training time, resulting in poor accuracy at test time. In this work, we identify the fundamental factors that give rise to this behavior by explaining why models fail this way even in easy-to-learn tasks where one would expect them to succeed. In particular, through a theoretical study of gradient-descent-trained linear classifiers on some easy-to-learn tasks, we uncover two complementary failure modes. These modes arise from how spurious correlations induce two kinds of skew in the data: one geometric in nature and the other statistical. Finally, we construct natural modifications of image classification datasets to understand when these failure modes can arise in practice. We also design experiments to isolate the two failure modes when training modern neural networks on these datasets.
format	Preprint
id	arxiv_https___arxiv_org_abs_2010_15775
institution	arXiv
publishDate	2020
record_format	arxiv
title	Understanding the Failure Modes of Out-of-Distribution Generalization
topic	Machine Learning; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2010.15775
