Bibliographic Details
Main Authors: Liang, Tongtong, Singh, Esha, Parhi, Rahul, Cloninger, Alexander, Wang, Yu-Xiang
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2603.04807
_version_ 1866915985582194688
author Liang, Tongtong
Singh, Esha
Parhi, Rahul
Cloninger, Alexander
Wang, Yu-Xiang
contents Gradient descent on overparameterized neural networks typically operates at the Edge of Stability (EoS), where the largest Hessian eigenvalue hovers around a step-size-dependent threshold. We study how sparse connectivity changes generalization below this threshold in two-layer ReLU networks. Prior results have shown that for fully-connected networks (FCNs), generalization guarantees in this regime degrade and become vacuous on high-dimensional spherical inputs. Our analysis reveals that sparse connectivity fundamentally alters this picture. Under sparse connectivity, the network processes a collection of low-dimensional patches rather than the full input vector, so the effective constraint imposed by the stability condition is governed by the geometry of the training patch collection. We prove that when the receptive fields are small relative to the ambient dimension, the effective constraint yields non-vacuous generalization bounds in precisely the spherical regime where FCNs provably fail. The same framework also reveals a contrasting failure mode: if the patch collection lacks geometric structure, the constraint becomes unable to prevent overfitting. We corroborate this theory by analyzing the patch geometry of natural images, showing that standard convolutional designs produce patch multisets with low-dimensional structure that facilitates generalization. This provides a principled explanation for the generalization advantage of convolutional networks. Thus, our analysis yields a unified framework that identifies how architecture, data geometry, and gradient descent jointly govern generalization performance.
format Preprint
id arxiv_https___arxiv_org_abs_2603_04807
institution arXiv
publishDate 2026
record_format arxiv
title Does Sparse Connectivity Improve Generalization? Convolutional Networks Below the Edge of Stability
topic Machine Learning
url https://arxiv.org/abs/2603.04807