Saved in:
Bibliographic Details
Main Authors: Demir, Samet, Dogan, Zafer
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2409.20250
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866912912138829824
author Demir, Samet
Dogan, Zafer
author_facet Demir, Samet
Dogan, Zafer
contents Random feature models (RFMs), two-layer networks with a randomly initialized fixed first layer and a trained linear readout, are among the simplest nonlinear predictors. Prior asymptotic analyses in the proportional high-dimensional regime show that, under isotropic data, RFMs reduce to noisy linear models and offer no advantage over classical linear methods such as ridge regression. Yet RFMs frequently outperform linear baselines on structured real data. We show that this tension is explained by a correlation-driven phase transition: under spiked-covariance designs, the interaction between anisotropy and input-label correlation determines whether the RFM behaves as an effectively linear predictor or exhibits genuinely nonlinear gains. Concretely, we establish a universality principle under anisotropy and characterize the RFM generalization error via an equivalent noisy polynomial model. The effective degree of this polynomial, equivalently, which Hermite orders of the activation survive, is governed by the strength of input-label correlation, yielding an explicit boundary in the correlation-spike-magnitude plane. Below the boundary, the RFM collapses to a linear surrogate and can underperform strong linear baselines; above it, higher-order terms persist and the RFM achieves a clear nonlinear advantage. Numerical simulations and real-data experiments corroborate the theory and delineate the transition between these two regimes.
format Preprint
id arxiv_https___arxiv_org_abs_2409_20250
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Input-Label Correlation Governs a Linear-to-Nonlinear Transition in Random Features under Spiked Covariance
Demir, Samet
Dogan, Zafer
Machine Learning
Random feature models (RFMs), two-layer networks with a randomly initialized fixed first layer and a trained linear readout, are among the simplest nonlinear predictors. Prior asymptotic analyses in the proportional high-dimensional regime show that, under isotropic data, RFMs reduce to noisy linear models and offer no advantage over classical linear methods such as ridge regression. Yet RFMs frequently outperform linear baselines on structured real data. We show that this tension is explained by a correlation-driven phase transition: under spiked-covariance designs, the interaction between anisotropy and input-label correlation determines whether the RFM behaves as an effectively linear predictor or exhibits genuinely nonlinear gains. Concretely, we establish a universality principle under anisotropy and characterize the RFM generalization error via an equivalent noisy polynomial model. The effective degree of this polynomial, equivalently, which Hermite orders of the activation survive, is governed by the strength of input-label correlation, yielding an explicit boundary in the correlation-spike-magnitude plane. Below the boundary, the RFM collapses to a linear surrogate and can underperform strong linear baselines; above it, higher-order terms persist and the RFM achieves a clear nonlinear advantage. Numerical simulations and real-data experiments corroborate the theory and delineate the transition between these two regimes.
title Input-Label Correlation Governs a Linear-to-Nonlinear Transition in Random Features under Spiked Covariance
topic Machine Learning
url https://arxiv.org/abs/2409.20250