Saved in:
Bibliographic Details
Main Authors: Yang, Fan, Zhang, Hongyang R., Wu, Sen, Ré, Christopher, Su, Weijie J.
Format: Preprint
Published: 2020
Subjects:
Online Access:https://arxiv.org/abs/2010.11750
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866913886064607232
author Yang, Fan
Zhang, Hongyang R.
Wu, Sen
Ré, Christopher
Su, Weijie J.
author_facet Yang, Fan
Zhang, Hongyang R.
Wu, Sen
Ré, Christopher
Su, Weijie J.
contents The problem of learning one task using samples from another task is central to transfer learning. In this paper, we focus on answering the following question: when does combining the samples from two related tasks perform better than learning with one target task alone? This question is motivated by an empirical phenomenon known as negative transfer, which has been observed in practice. While the transfer effect from one task to another depends on factors such as their sample sizes and the spectrum of their covariance matrices, precisely quantifying this dependence has remained a challenging problem. In order to compare a transfer learning estimator to single-task learning, one needs to compare the risks between the two estimators precisely. Further, the comparison depends on the distribution shifts between the two tasks. This paper applies recent developments of random matrix theory to tackle this challenge in a high-dimensional linear regression setting with two tasks. We show precise high-dimensional asymptotics for the bias and variance of a classical hard parameter sharing (HPS) estimator in the proportional limit, where the sample sizes of both tasks increase proportionally with dimension at fixed ratios. The precise asymptotics apply to various types of distribution shifts, including covariate shifts, model shifts, and combinations of both. We illustrate these results in a random-effects model to mathematically prove a phase transition from positive to negative transfer as the number of source task samples increases. One insight from the analysis is that a rebalanced HPS estimator, which downsizes the source task when the model shift is high, achieves the minimax optimal rate. The finding regarding phase transition also applies to multiple tasks when covariates are shared across tasks. Simulations validate the accuracy of the high-dimensional asymptotics for finite dimensions.
format Preprint
id arxiv_https___arxiv_org_abs_2010_11750
institution arXiv
publishDate 2020
record_format arxiv
spellingShingle Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers
Yang, Fan
Zhang, Hongyang R.
Wu, Sen
Ré, Christopher
Su, Weijie J.
Machine Learning
The problem of learning one task using samples from another task is central to transfer learning. In this paper, we focus on answering the following question: when does combining the samples from two related tasks perform better than learning with one target task alone? This question is motivated by an empirical phenomenon known as negative transfer, which has been observed in practice. While the transfer effect from one task to another depends on factors such as their sample sizes and the spectrum of their covariance matrices, precisely quantifying this dependence has remained a challenging problem. In order to compare a transfer learning estimator to single-task learning, one needs to compare the risks between the two estimators precisely. Further, the comparison depends on the distribution shifts between the two tasks. This paper applies recent developments of random matrix theory to tackle this challenge in a high-dimensional linear regression setting with two tasks. We show precise high-dimensional asymptotics for the bias and variance of a classical hard parameter sharing (HPS) estimator in the proportional limit, where the sample sizes of both tasks increase proportionally with dimension at fixed ratios. The precise asymptotics apply to various types of distribution shifts, including covariate shifts, model shifts, and combinations of both. We illustrate these results in a random-effects model to mathematically prove a phase transition from positive to negative transfer as the number of source task samples increases. One insight from the analysis is that a rebalanced HPS estimator, which downsizes the source task when the model shift is high, achieves the minimax optimal rate. The finding regarding phase transition also applies to multiple tasks when covariates are shared across tasks. Simulations validate the accuracy of the high-dimensional asymptotics for finite dimensions.
title Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers
topic Machine Learning
url https://arxiv.org/abs/2010.11750