Saved in:
Bibliographic Details
Main Authors: Hermant, Julien, Aujol, Jean-François, Dossal, Charles, Huang, Lorick, Rondepierre, Aude, Waldspurger, Irène
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2602.05504
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866910256230039552
author Hermant, Julien
Aujol, Jean-François
Dossal, Charles
Huang, Lorick
Rondepierre, Aude
Waldspurger, Irène
author_facet Hermant, Julien
Aujol, Jean-François
Dossal, Charles
Huang, Lorick
Rondepierre, Aude
Waldspurger, Irène
contents For first-order optimization of non-convex functions with Lipschitz continuous gradient and Hessian, the best known complexity for reaching an $\varepsilon$-approximation of a stationary point is $\mathcal{O}(\varepsilon^{-7/4})$. Existing algorithms achieving this bound are based on momentum, but are always complemented with safeguard mechanisms that erase the accumulated momentum if a certain condition is violated. Whether such resetting-momentum mechanisms are fundamentally necessary has remained an open question. We show that randomizing the parameters enable to achieve this complexity with high probability when using momentum, without any of such mechanisms. From an analysis perspective, we do so leveraging the continuized method, that interprets the algorithm as a realization of a continuous-time stochastic differential equation involving a Poisson process.
format Preprint
id arxiv_https___arxiv_org_abs_2602_05504
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Continuized Nesterov Momentum Achieves the $O(\varepsilon^{-7/4})$ Complexity without Additional Mechanisms
Hermant, Julien
Aujol, Jean-François
Dossal, Charles
Huang, Lorick
Rondepierre, Aude
Waldspurger, Irène
Optimization and Control
For first-order optimization of non-convex functions with Lipschitz continuous gradient and Hessian, the best known complexity for reaching an $\varepsilon$-approximation of a stationary point is $\mathcal{O}(\varepsilon^{-7/4})$. Existing algorithms achieving this bound are based on momentum, but are always complemented with safeguard mechanisms that erase the accumulated momentum if a certain condition is violated. Whether such resetting-momentum mechanisms are fundamentally necessary has remained an open question. We show that randomizing the parameters enable to achieve this complexity with high probability when using momentum, without any of such mechanisms. From an analysis perspective, we do so leveraging the continuized method, that interprets the algorithm as a realization of a continuous-time stochastic differential equation involving a Poisson process.
title Continuized Nesterov Momentum Achieves the $O(\varepsilon^{-7/4})$ Complexity without Additional Mechanisms
topic Optimization and Control
url https://arxiv.org/abs/2602.05504