Saved in:
Bibliographic Details
Main Authors: Liu, Yuanshi, Zhang, Haihan, Chen, Qian, Fang, Cong
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2502.09047
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866912230966034432
author Liu, Yuanshi
Zhang, Haihan
Chen, Qian
Fang, Cong
author_facet Liu, Yuanshi
Zhang, Haihan
Chen, Qian
Fang, Cong
contents A common pursuit in modern statistical learning is to attain satisfactory generalization out of the source data distribution (OOD). In theory, the challenge remains unsolved even under the canonical setting of covariate shift for the linear model. This paper studies the foundational (high-dimensional) linear regression where the ground truth variables are confined to an ellipse-shape constraint and addresses two fundamental questions in this regime: (i) given the target covariate matrix, what is the min-max \emph{optimal} algorithm under covariate shift? (ii) for what kinds of target classes, the commonly-used SGD-type algorithms achieve optimality? Our analysis starts with establishing a tight lower generalization bound via a Bayesian Cramer-Rao inequality. For (i), we prove that the optimal estimator can be simply a certain linear transformation of the best estimator for the source distribution. Given the source and target matrices, we show that the transformation can be efficiently computed via a convex program. The min-max optimal analysis for SGD leverages the idea that we recognize both the accumulated updates of the applied algorithms and the ideal transformation as preconditions on the learning variables. We provide sufficient conditions when SGD with its acceleration variants attain optimality.
format Preprint
id arxiv_https___arxiv_org_abs_2502_09047
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition
Liu, Yuanshi
Zhang, Haihan
Chen, Qian
Fang, Cong
Machine Learning
A common pursuit in modern statistical learning is to attain satisfactory generalization out of the source data distribution (OOD). In theory, the challenge remains unsolved even under the canonical setting of covariate shift for the linear model. This paper studies the foundational (high-dimensional) linear regression where the ground truth variables are confined to an ellipse-shape constraint and addresses two fundamental questions in this regime: (i) given the target covariate matrix, what is the min-max \emph{optimal} algorithm under covariate shift? (ii) for what kinds of target classes, the commonly-used SGD-type algorithms achieve optimality? Our analysis starts with establishing a tight lower generalization bound via a Bayesian Cramer-Rao inequality. For (i), we prove that the optimal estimator can be simply a certain linear transformation of the best estimator for the source distribution. Given the source and target matrices, we show that the transformation can be efficiently computed via a convex program. The min-max optimal analysis for SGD leverages the idea that we recognize both the accumulated updates of the applied algorithms and the ideal transformation as preconditions on the learning variables. We provide sufficient conditions when SGD with its acceleration variants attain optimality.
title Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition
topic Machine Learning
url https://arxiv.org/abs/2502.09047