Saved in:
Bibliographic Details
Main Author: Zhang, Junyu
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2401.03155
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • We investigate stochastic Bregman proximal gradient (SBPG) methods for minimizing a finite-sum nonconvex function $Ψ(x):=\frac{1}{n}\sum_{i=1}^nf_i(x)+ϕ(x)$, where $ϕ$ is convex and nonsmooth, while $f_i$, instead of gradient global Lipschitz continuity, satisfies a smooth-adaptability condition w.r.t. some kernel $h$. Standard acceleration techniques for stochastic algorithms (momentum, shuffling, variance reduction) depend on bounding stochastic errors by gradient differences that are further controlled via Lipschitz property. Lacking this, existing SBPG results are limited to vanilla stochastic approximation that cannot yield the optimal $O(\sqrt{n})$ complexity dependence on $n$. Moreover, existing works report complexities under various nonstandard stationarity measures that largely deviate from the standard minimal limiting Fréchet subdifferential $\mathrm{dist}(0,\partialΨ(\cdot))$. Our analysis reveals that these popular stationarity measures are often much smaller than $\mathrm{dist}(0,\partialΨ(\cdot))$, leading to overstated solution quality and non-stationary output. To resolve these issues, we design a new gradient mapping $\mathcal{D}_{ϕ,h}^λ(\cdot)$ by BPG residuals in dual space and a new kernel-conditioning (KC) regularity, under which the mismatch between $\|\mathcal{D}_{ϕ,h}^λ(\cdot)\|$ and $\mathrm{dist}(0,\partialΨ(\cdot))$ is provably $O(1)$ and instance-free. Moreover, KC-regularity guarantees Lipschitz-like bounds for gradient differences, providing general analysis tools for momentum, shuffling, and variance reduction under smooth-adaptability. We illustrate this point on variance reduced SBPG methods and establish an $O(\sqrt{n})$ complexity for $\|\mathcal{D}_{ϕ,h}^λ(\cdot)\|$, providing instance-free (worst-case) complexity under $\mathrm{dist}(0,\partialΨ(\cdot))$.