| Main Authors: | , , |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.07264 |
Table of Contents:
- We study the generalization capability of nearly-interpolating linear regressors: $\boldsymbol{\beta}$'s whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random matrix theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix $\boldsymbol{\Sigma}$, we demonstrate that any near-interpolator exhibits rapid norm growth: for $\tau$ fixed, $\boldsymbol{\beta}$ has squared $\ell_2$-norm $\mathbb{E}[\|\boldsymbol{\beta}\|_{2}^{2}] = \Omega(n^{\alpha})$, where $n$ is the number of samples and $\alpha > 1$ is the exponent of the eigendecay, i.e., $\lambda_i(\boldsymbol{\Sigma}) \sim i^{-\alpha}$. This implies that existing data-independent norm-based bounds are necessarily loose. On the other hand, in the same regime we precisely characterize the asymptotic trade-off between interpolation and generalization. Our characterization reveals that larger norm scaling exponents $\alpha$ correspond to worse trade-offs between interpolation and generalization. We verify empirically that a similar phenomenon holds for nearly-interpolating shallow neural networks.
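
The norm-growth claim can be probed numerically. The following is a minimal sketch under assumptions that are not taken from the paper: Gaussian features with covariance eigenvalues $i^{-\alpha}$, a hypothetical ground-truth vector `beta_star`, and ridge regression as one convenient family of near-interpolators. The function name `near_interpolator_norm` and all default values (`d`, `sigma`, `tau`) are illustrative choices. The ridge penalty is tuned by bisection so that the training mean squared error equals `tau`, which is set below the noise variance `sigma**2`.

```python
import numpy as np

def near_interpolator_norm(n, alpha=2.0, d=2000, sigma=0.5, tau=0.1, seed=0):
    """Return (squared l2-norm, training MSE) of a ridge solution whose
    penalty is tuned so that the training MSE equals tau < sigma**2."""
    rng = np.random.default_rng(seed)
    # Assumed eigendecay: lambda_i = i^(-alpha), i = 1..d.
    eigs = np.arange(1, d + 1, dtype=float) ** (-alpha)
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)
    # Hypothetical ground truth: signal on the leading coordinate only.
    beta_star = np.zeros(d)
    beta_star[0] = 1.0
    y = X @ beta_star + sigma * rng.standard_normal(n)

    # Work in the eigenbasis of the n x n Gram matrix K = X X^T.
    s, U = np.linalg.eigh(X @ X.T)
    s = np.clip(s, 0.0, None)
    a2 = (U.T @ y) ** 2

    def train_mse(c):
        # Ridge residual is c (K + c I)^{-1} y, where c plays the role of the penalty.
        return np.sum((c / (s + c)) ** 2 * a2) / n

    # Training MSE is increasing in c, so bisect on log(c).
    lo, hi = -30.0, 30.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if train_mse(np.exp(mid)) < tau:
            lo = mid
        else:
            hi = mid
    c = np.exp(0.5 * (lo + hi))

    # beta = X^T (K + c I)^{-1} y, so ||beta||^2 = sum_j s_j a_j^2 / (s_j + c)^2.
    sq_norm = np.sum(s * a2 / (s + c) ** 2)
    return sq_norm, train_mse(c)

if __name__ == "__main__":
    for n in (50, 100, 200):
        sq_norm, mse = near_interpolator_norm(n)
        print(f"n={n:4d}  train MSE={mse:.4f}  ||beta||^2={sq_norm:.2f}")
```

With the training error held fixed, the printed squared norm should increase with `n`. This single-seed experiment only illustrates the trend and is not a verification of the $\Omega(n^{\alpha})$ rate.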