Staff View: Nesterov acceleration despite very noisy gradients :: Library Catalog

Bibliographic Details
Main Authors:	Gupta, Kanan; Siegel, Jonathan W.; Wojtowytsch, Stephan
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Optimization and Control; 68T07
Online Access:	https://arxiv.org/abs/2302.05515

_version_	1866915001208406016
author	Gupta, Kanan; Siegel, Jonathan W.; Wojtowytsch, Stephan
author_facet	Gupta, Kanan; Siegel, Jonathan W.; Wojtowytsch, Stephan
contents	We present a generalization of Nesterov's accelerated gradient descent algorithm. Our algorithm (AGNES) provably achieves acceleration for smooth convex and strongly convex minimization tasks with noisy gradient estimates if the noise intensity is proportional to the magnitude of the gradient at every point. Nesterov's method converges at an accelerated rate if the constant of proportionality is below 1, while AGNES accommodates any signal-to-noise ratio. The noise model is motivated by applications in overparametrized machine learning. AGNES requires only two parameters in convex and three in strongly convex minimization tasks, improving on existing methods. We further provide clear geometric interpretations and heuristics for the choice of parameters.
format	Preprint
id	arxiv_https___arxiv_org_abs_2302_05515
institution	arXiv
publishDate	2023
record_format	arxiv
title	Nesterov acceleration despite very noisy gradients
topic	Machine Learning; Optimization and Control; 68T07
url	https://arxiv.org/abs/2302.05515
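
Note on the noise model described in the contents field: the abstract says the noise intensity is proportional to the magnitude of the gradient at every point. A minimal Python sketch of one such gradient oracle follows. It is an illustration only and is not taken from the preprint: the function names `noisy_gradient` and `grad_f`, the Gaussian noise, the test objective f(x) = 0.5 * ||x||^2, and the value of `sigma` are all assumptions made here. The AGNES update rule itself is not given in this record, so it is not reproduced.

```python
import numpy as np


def noisy_gradient(grad, sigma, rng):
    """Hypothetical oracle: unbiased estimate of `grad` whose noise scales with ||grad||.

    The noise has mean zero and expected squared norm sigma^2 * ||grad||^2.
    """
    d = grad.shape[0]
    z = rng.standard_normal(d)  # E||z||^2 = d for standard normal noise
    return grad + sigma * np.linalg.norm(grad) / np.sqrt(d) * z


def grad_f(x):
    # Gradient of the assumed test objective f(x) = 0.5 * ||x||^2.
    return x


rng = np.random.default_rng(0)
x = np.array([3.0, -4.0])
sigma = 2.0  # assumed constant of proportionality, chosen above 1

true_grad = grad_f(x)
samples = np.stack([noisy_gradient(true_grad, sigma, rng) for _ in range(20000)])

# Empirical mean of the estimates and noise variance relative to ||grad||^2.
mean_estimate = samples.mean(axis=0)
rel_var = np.mean(np.sum((samples - true_grad) ** 2, axis=1)) / np.sum(true_grad ** 2)

# At a critical point the gradient vanishes, and so does the noise.
at_minimum = noisy_gradient(grad_f(np.zeros(2)), sigma, rng)
```

In this sketch the relative variance `rel_var` should come out close to `sigma ** 2`, and the estimate at the minimizer is exactly zero, which is the property that distinguishes this multiplicative model from additive noise of fixed size.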
