Gespeichert in:
| Hauptverfasser: | , , , , , |
|---|---|
| Format: | Preprint |
| Veröffentlicht: |
2026
|
| Schlagworte: | |
| Online-Zugang: | https://arxiv.org/abs/2602.05504 |
| Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Inhaltsangabe:
- For first-order optimization of non-convex functions with Lipschitz continuous gradient and Hessian, the best known complexity for reaching an $\varepsilon$-approximation of a stationary point is $\mathcal{O}(\varepsilon^{-7/4})$. Existing algorithms achieving this bound are based on momentum, but are always complemented with safeguard mechanisms that erase the accumulated momentum if a certain condition is violated. Whether such resetting-momentum mechanisms are fundamentally necessary has remained an open question. We show that randomizing the parameters enable to achieve this complexity with high probability when using momentum, without any of such mechanisms. From an analysis perspective, we do so leveraging the continuized method, that interprets the algorithm as a realization of a continuous-time stochastic differential equation involving a Poisson process.