Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	He, Chang, Gao, Wenzhi, Jiang, Bo, Udell, Madeleine, Zhang, Shuzhong
Format:	Preprint
Published:	2025
Subjects:	Optimization and Control
Online Access:	https://arxiv.org/abs/2512.06231
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866910044589654016
author	He, Chang Gao, Wenzhi Jiang, Bo Udell, Madeleine Zhang, Shuzhong
author_facet	He, Chang Gao, Wenzhi Jiang, Bo Udell, Madeleine Zhang, Shuzhong
contents	In this paper, we revisit a classical adaptive stepsize strategy for gradient descent: the Polyak stepsize (PolyakGD), originally proposed in Polyak (1969). We study the convergence behavior of PolyakGD from two perspectives: tight worst-case analysis and universality across function classes. As our first main result, we establish the tightness of the known convergence rates of PolyakGD by explicitly constructing worst-case functions. In particular, we show that the $O((1-\frac{1}κ)^K)$ rate for smooth strongly convex functions and the $O(1/K)$ rate for smooth convex functions are both tight. Moreover, we theoretically show that PolyakGD automatically exploits floating-point errors to escape the worst-case behavior. Our second main result provides new convergence guarantees for PolyakGD under both Hölder smoothness and Hölder growth conditions. These findings show that the Polyak stepsize is universal, automatically adapting to various function classes without requiring prior knowledge of problem parameters.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_06231
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	New Results on the Polyak Stepsize: Tight Convergence Analysis and Universal Function Classes He, Chang Gao, Wenzhi Jiang, Bo Udell, Madeleine Zhang, Shuzhong Optimization and Control In this paper, we revisit a classical adaptive stepsize strategy for gradient descent: the Polyak stepsize (PolyakGD), originally proposed in Polyak (1969). We study the convergence behavior of PolyakGD from two perspectives: tight worst-case analysis and universality across function classes. As our first main result, we establish the tightness of the known convergence rates of PolyakGD by explicitly constructing worst-case functions. In particular, we show that the $O((1-\frac{1}κ)^K)$ rate for smooth strongly convex functions and the $O(1/K)$ rate for smooth convex functions are both tight. Moreover, we theoretically show that PolyakGD automatically exploits floating-point errors to escape the worst-case behavior. Our second main result provides new convergence guarantees for PolyakGD under both Hölder smoothness and Hölder growth conditions. These findings show that the Polyak stepsize is universal, automatically adapting to various function classes without requiring prior knowledge of problem parameters.
title	New Results on the Polyak Stepsize: Tight Convergence Analysis and Universal Function Classes
topic	Optimization and Control
url	https://arxiv.org/abs/2512.06231

Similar Items