Enregistré dans:
Détails bibliographiques
Auteurs principaux: Pehle, Christian, Slotine, Jean-Jacques
Format: Preprint
Publié: 2026
Sujets:
Accès en ligne:https://arxiv.org/abs/2603.03610
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
_version_ 1866910040229675008
author Pehle, Christian
Slotine, Jean-Jacques
author_facet Pehle, Christian
Slotine, Jean-Jacques
contents Understanding how systems built out of modular components can be jointly optimized is an important problem in biology, engineering, and machine learning. The backpropagation algorithm is one such solution and has been instrumental in the success of neural networks. Despite its empirical success, a strong theoretical understanding of it is lacking. Here, we combine tools from Riemannian geometry, optimal control theory, and theoretical physics to advance this understanding. We make three key contributions: First, we revisit the derivation of backpropagation as a constrained optimization problem and combine it with the insight that Riemannian gradient descent trajectories can be understood as the minimum of an action. Second, we introduce a recursively defined layerwise Riemannian metric that exploits the modular structure of neural networks and can be efficiently computed using the Woodbury matrix identity, avoiding the $O(n^3)$ cost of full metric inversion. Third, we develop a framework of composable ``Riemannian modules'' whose convergence properties can be quantified using nonlinear contraction theory, providing algorithmic stability guarantees of order $O(κ^2 L/(ξμ\sqrt{n}))$ where $κ$ and $L$ are Lipschitz constants, $μ$ is the mass matrix scale, and $ξ$ bounds the condition number. Our layerwise metric approach provides a practical alternative to natural gradient descent. While we focus here on studying neural networks, our approach more generally applies to the study of systems made of modules that are optimized over time, as it occurs in biology during both evolution and development.
format Preprint
id arxiv_https___arxiv_org_abs_2603_03610
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Riemannian Optimization in Modular Systems
Pehle, Christian
Slotine, Jean-Jacques
Machine Learning
Understanding how systems built out of modular components can be jointly optimized is an important problem in biology, engineering, and machine learning. The backpropagation algorithm is one such solution and has been instrumental in the success of neural networks. Despite its empirical success, a strong theoretical understanding of it is lacking. Here, we combine tools from Riemannian geometry, optimal control theory, and theoretical physics to advance this understanding. We make three key contributions: First, we revisit the derivation of backpropagation as a constrained optimization problem and combine it with the insight that Riemannian gradient descent trajectories can be understood as the minimum of an action. Second, we introduce a recursively defined layerwise Riemannian metric that exploits the modular structure of neural networks and can be efficiently computed using the Woodbury matrix identity, avoiding the $O(n^3)$ cost of full metric inversion. Third, we develop a framework of composable ``Riemannian modules'' whose convergence properties can be quantified using nonlinear contraction theory, providing algorithmic stability guarantees of order $O(κ^2 L/(ξμ\sqrt{n}))$ where $κ$ and $L$ are Lipschitz constants, $μ$ is the mass matrix scale, and $ξ$ bounds the condition number. Our layerwise metric approach provides a practical alternative to natural gradient descent. While we focus here on studying neural networks, our approach more generally applies to the study of systems made of modules that are optimized over time, as it occurs in biology during both evolution and development.
title Riemannian Optimization in Modular Systems
topic Machine Learning
url https://arxiv.org/abs/2603.03610