Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Gruhlke, Robert, Nouy, Anthony, Trunschke, Philipp
Format:	Preprint
Published:	2024
Subjects:	Optimization and Control Statistics Theory 49M15, 93E24, 68T07, 90C30, 60H30
Online Access:	https://arxiv.org/abs/2402.03113
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866915762558468096
author	Gruhlke, Robert Nouy, Anthony Trunschke, Philipp
author_facet	Gruhlke, Robert Nouy, Anthony Trunschke, Philipp
contents	We consider the problem of optimising the expected value of a loss functional over a nonlinear model class of functions, assuming that we have only access to realisations of the gradient of the loss. This is a classical task in statistics, machine learning and physics-informed machine learning. A straightforward solution is to replace the exact objective with a Monte Carlo estimate before employing standard first-order methods like gradient descent, which yields the classical stochastic gradient descent method. But replacing the true objective with an estimate ensues a generalisation error. Rigorous bounds for this error typically require strong compactness and Lipschitz continuity assumptions while providing a very slow decay with sample size. To alleviate these issues, we propose a version of natural gradient descent that is based on optimal sampling methods. Under classical assumptions on the loss and the nonlinear model class, we prove that this scheme converges almost surely monotonically to a stationary point of the true objective. Under Polyak-Lojasiewicz-type conditions, this provides bounds for the generalisation error. As a remarkable result, we show that our stochastic optimisation scheme achieves the linear or exponential convergence rates of deterministic first order descent methods under suitable conditions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_03113
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Optimal sampling for stochastic and natural gradient descent Gruhlke, Robert Nouy, Anthony Trunschke, Philipp Optimization and Control Statistics Theory 49M15, 93E24, 68T07, 90C30, 60H30 We consider the problem of optimising the expected value of a loss functional over a nonlinear model class of functions, assuming that we have only access to realisations of the gradient of the loss. This is a classical task in statistics, machine learning and physics-informed machine learning. A straightforward solution is to replace the exact objective with a Monte Carlo estimate before employing standard first-order methods like gradient descent, which yields the classical stochastic gradient descent method. But replacing the true objective with an estimate ensues a generalisation error. Rigorous bounds for this error typically require strong compactness and Lipschitz continuity assumptions while providing a very slow decay with sample size. To alleviate these issues, we propose a version of natural gradient descent that is based on optimal sampling methods. Under classical assumptions on the loss and the nonlinear model class, we prove that this scheme converges almost surely monotonically to a stationary point of the true objective. Under Polyak-Lojasiewicz-type conditions, this provides bounds for the generalisation error. As a remarkable result, we show that our stochastic optimisation scheme achieves the linear or exponential convergence rates of deterministic first order descent methods under suitable conditions.
title	Optimal sampling for stochastic and natural gradient descent
topic	Optimization and Control Statistics Theory 49M15, 93E24, 68T07, 90C30, 60H30
url	https://arxiv.org/abs/2402.03113

Similar Items