Bibliographic Details
Main authors: Karhadkar, Kedar; Sietsema, Alexander; Needell, Deanna; Montufar, Guido
Format: Preprint
Published: 2026
Subjects:
Online access: https://arxiv.org/abs/2602.00825
_version_ 1866917238959767552
author Karhadkar, Kedar
Sietsema, Alexander
Needell, Deanna
Montufar, Guido
contents Motivated by recent work on benign overfitting in overparameterized machine learning, we study the generalization behavior of functions in Sobolev spaces $W^{k, p}(\mathbb{R}^d)$ that perfectly fit a noisy training data set. Under assumptions of label noise and sufficient regularity in the data distribution, we show that approximately norm-minimizing interpolators, which are canonical solutions selected by smoothness bias, exhibit harmful overfitting: even as the training sample size $n \to \infty$, the generalization error remains bounded below by a positive constant with high probability. Our results hold for arbitrary values of $p \in [1, \infty)$, in contrast to prior results studying the Hilbert space case ($p = 2$) using kernel methods. Our proof uses a geometric argument which identifies harmful neighborhoods of the training data using Sobolev inequalities.
format Preprint
id arxiv_https___arxiv_org_abs_2602_00825
institution arXiv
publishDate 2026
record_format arxiv
title Harmful Overfitting in Sobolev Spaces
topic Machine Learning
url https://arxiv.org/abs/2602.00825