Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Moshtaghifar, Mohammad, Rodomanov, Anton, Vankov, Daniil, Stich, Sebastian
Format:	Preprint
Published:	2025
Subjects:	Optimization and Control Machine Learning
Online Access:	https://arxiv.org/abs/2501.10258
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866908980955054080
author	Moshtaghifar, Mohammad Rodomanov, Anton Vankov, Daniil Stich, Sebastian
author_facet	Moshtaghifar, Mohammad Rodomanov, Anton Vankov, Daniil Stich, Sebastian
contents	We present a novel universal gradient method for solving convex optimization problems. Our algorithm, Dual Averaging with Distance Adaptation (DADA), is based on the classical scheme of dual averaging and dynamically adjusts its coefficients based on observed gradients and the distance between iterates and the starting point, eliminating the need for problem-specific parameters. DADA is a universal algorithm that simultaneously works for a broad spectrum of problem classes, provided the local growth of the objective function around its minimizer can be bounded. Particular examples of such problem classes are nonsmooth Lipschitz functions, Lipschitz-smooth functions, Hölder-smooth functions, functions with high-order Lipschitz derivative, quasi-self-concordant functions, and $(L_0,L_1)$-smooth functions. Crucially, DADA is applicable to both unconstrained and constrained problems, even when the domain is unbounded, without requiring prior knowledge of the number of iterations or desired accuracy.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_10258
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	DADA: Dual Averaging with Distance Adaptation Moshtaghifar, Mohammad Rodomanov, Anton Vankov, Daniil Stich, Sebastian Optimization and Control Machine Learning We present a novel universal gradient method for solving convex optimization problems. Our algorithm, Dual Averaging with Distance Adaptation (DADA), is based on the classical scheme of dual averaging and dynamically adjusts its coefficients based on observed gradients and the distance between iterates and the starting point, eliminating the need for problem-specific parameters. DADA is a universal algorithm that simultaneously works for a broad spectrum of problem classes, provided the local growth of the objective function around its minimizer can be bounded. Particular examples of such problem classes are nonsmooth Lipschitz functions, Lipschitz-smooth functions, Hölder-smooth functions, functions with high-order Lipschitz derivative, quasi-self-concordant functions, and $(L_0,L_1)$-smooth functions. Crucially, DADA is applicable to both unconstrained and constrained problems, even when the domain is unbounded, without requiring prior knowledge of the number of iterations or desired accuracy.
title	DADA: Dual Averaging with Distance Adaptation
topic	Optimization and Control Machine Learning
url	https://arxiv.org/abs/2501.10258

Similar Items