Bibliographic Details
Main Authors: Perera, David; Moura, Victor; Santos, Lais Isabelle Alves dos; Haddad, Michel F. C.; Figueiredo, Flavio
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2605.21692
author Perera, David
Moura, Victor
Santos, Lais Isabelle Alves dos
Haddad, Michel F. C.
Figueiredo, Flavio
contents Characterizing precisely the asymptotic generalization error of neural networks using parameters that can be estimated efficiently is a crucial problem in machine learning, which relies heavily on heuristics and practitioners' intuition to make key design choices. In order to mitigate this issue, we introduce the Representation Gap, a metric closely related to the generalization error, but admitting better-behaved asymptotic dynamics. Focusing on equivariant diffusion models and leveraging results from optimal quantization and point-process theory, we derive a precise asymptotic equivalent of the Representation Gap and show that it is governed by a single parameter, the intrinsic dimension of the task, which is easy to interpret, efficient to estimate, and can be linked to the equivariances of common neural network architectures. We show that this asymptotic dynamic also extends to a broader range of tasks and training algorithms. Finally, we demonstrate empirically that our asymptotic law and intrinsic dimension estimation are accurate on a wide range of synthetic datasets, where these quantities are known, as well as on more realistic datasets, where we obtain results consistent with the related literature.
format Preprint
id arxiv_https___arxiv_org_abs_2605_21692
institution arXiv
publishDate 2026
record_format arxiv
title Representation Gap: Explaining the Unreasonable Effectiveness of Neural Networks from a Geometric Perspective
topic Machine Learning
url https://arxiv.org/abs/2605.21692