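Staff View: Does Sparse Connectivity Improve Generalization? Convolutional Networks Below the Edge of Stability :: Library Catalog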

Bibliographic Details
Main Authors:	Liang, Tongtong; Singh, Esha; Parhi, Rahul; Cloninger, Alexander; Wang, Yu-Xiang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2603.04807

_version_	1866915985582194688
author	Liang, Tongtong; Singh, Esha; Parhi, Rahul; Cloninger, Alexander; Wang, Yu-Xiang
author_facet	Liang, Tongtong; Singh, Esha; Parhi, Rahul; Cloninger, Alexander; Wang, Yu-Xiang
contents	Gradient descent on overparameterized neural networks typically operates at the Edge of Stability (EoS), where the largest Hessian eigenvalue hovers around a step-size-dependent threshold. We study how sparse connectivity changes generalization below this threshold in two-layer ReLU networks. Prior results have shown that for fully-connected networks (FCNs), generalization guarantees in this regime degrade and become vacuous on high-dimensional spherical inputs. Our analysis reveals that sparse connectivity fundamentally alters this picture. Under sparse connectivity, the network processes a collection of low-dimensional patches rather than the full input vector, so the effective constraint imposed by the stability condition is governed by the geometry of the training patch collection. We prove that when the receptive fields are small relative to the ambient dimension, the effective constraint yields non-vacuous generalization bounds in precisely the spherical regime where FCNs provably fail. The same framework also reveals a contrasting failure mode: if the patch collection lacks geometric structure, the constraint becomes unable to prevent overfitting. We corroborate this theory by analyzing the patch geometry of natural images, showing that standard convolutional designs produce patch multisets with low-dimensional structure that facilitates generalization. This provides a principled explanation for the generalization advantage of convolutional networks. Thus, our analysis yields a unified framework that identifies how architecture, data geometry, and gradient descent jointly govern generalization performance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_04807
institution	arXiv
publishDate	2026
record_format	arxiv
title	Does Sparse Connectivity Improve Generalization? Convolutional Networks Below the Edge of Stability
topic	Machine Learning
url	https://arxiv.org/abs/2603.04807

Similar Items