:: Library Catalog

Bibliographic Details
Main Authors:	Song, Zhao; Yue, Song
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2305.08001

Similar Items

Fast Inference with Kronecker-Sparse Matrices
by: Gonon, Antoine, et al.
Published: (2024)

Rethinking Bregman Divergences in Kronecker-Factored Optimizers
by: Liu, Bing, et al.
Published: (2026)

Quantum Speedup for Spectral Approximation of Kronecker Products
by: Gao, Yeqi, et al.
Published: (2024)

Scaling Laws of SignSGD in Linear Regression: When Does It Outperform SGD?
by: Kim, Jihwan, et al.
Published: (2026)

Does SGD really happen in tiny subspaces?
by: Song, Minhak, et al.
Published: (2024)

Topology-aware Generalization of Decentralized SGD
by: Zhu, Tongtian, et al.
Published: (2022)

Fast and Efficient Matching Algorithm with Deadline Instances
by: Song, Zhao, et al.
Published: (2023)

Generalization and Optimization of SGD with Lookahead
by: Li, Kangcheng, et al.
Published: (2025)

Kronecker-Structured Nonparametric Spatiotemporal Point Processes
by: Xu, Zhitong, et al.
Published: (2026)

Scalable Gaussian Processes with Latent Kronecker Structure
by: Lin, Jihao Andreas, et al.
Published: (2025)

Bootstrap SGD: Algorithmic Stability and Robustness
by: Christmann, Andreas, et al.
Published: (2024)

A Short Note on Batch-efficient Divide-and-Conquer Algorithm for EigenDecomposition
by: Song, Yue
Published: (2026)

The ADMM-PINNs Algorithmic Framework for Nonsmooth PDE-Constrained Optimization: A Deep Learning Approach
by: Song, Yongcun, et al.
Published: (2023)

Diagonalisation SGD: Fast & Convergent SGD for Non-Differentiable Models via Reparameterisation and Smoothing
by: Wagner, Dominik, et al.
Published: (2024)

Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
by: Bars, Batiste Le, et al.
Published: (2023)

Non-Euclidean SGD for Structured Optimization: Unified Analysis and Improved Rates
by: Kovalev, Dmitry, et al.
Published: (2025)

Faster Algorithms for Structured Linear and Kernel Support Vector Machines
by: Gu, Yuzhou, et al.
Published: (2023)

Higher-Order Transformers With Kronecker-Structured Attention
by: Omranpour, Soroush, et al.
Published: (2024)

Truncated Non-Uniform Quantization for Distributed SGD
by: Yan, Guangfeng, et al.
Published: (2024)

SLowcal-SGD: Slow Query Points Improve Local-SGD for Stochastic Convex Optimization
by: Dahan, Tehila, et al.
Published: (2023)

Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform
by: Alman, Josh, et al.
Published: (2025)

Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
by: Attia, Amit, et al.
Published: (2025)

Learning Kronecker-Structured Graphs from Smooth Signals
by: Shi, Changhao, et al.
Published: (2025)

SketchySGD: Reliable Stochastic Optimization via Randomized Curvature Estimates
by: Frangella, Zachary, et al.
Published: (2022)

Suspicious Alignment of SGD: A Fine-Grained Step Size Condition Analysis
by: Deng, Shenyang, et al.
Published: (2026)

The Optimization Landscape of SGD Across the Feature Learning Strength
by: Atanasov, Alexander, et al.
Published: (2024)

Stochastic Resetting Mitigates Latent Gradient Bias of SGD from Label Noise
by: Bae, Youngkyoung, et al.
Published: (2024)

Trustworthy Efficient Communication for Distributed Learning using LQ-SGD Algorithm
by: Li, Hongyang, et al.
Published: (2025)

Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)

From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees
by: Xie, Shengping, et al.
Published: (2025)

Minibatch and Local SGD: Algorithmic Stability and Linear Speedup in Generalization
by: Lei, Yunwen, et al.
Published: (2023)

Scaling Gaussian Processes for Learning Curve Prediction via Latent Kronecker Structure
by: Lin, Jihao Andreas, et al.
Published: (2024)

Fast John Ellipsoid Computation with Differential Privacy Optimization
by: Li, Xiaoyu, et al.
Published: (2024)

The Optimality of (Accelerated) SGD for High-Dimensional Quadratic Optimization
by: Zhang, Haihan, et al.
Published: (2024)

SGD with Partial Hessian for Deep Neural Networks Optimization
by: Sun, Ying, et al.
Published: (2024)

Optimal Projection-Free Adaptive SGD for Matrix Optimization
by: Kovalev, Dmitry
Published: (2026)

On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization
by: Sahu, Sharan, et al.
Published: (2026)

An Iterative Algorithm for Rescaled Hyperbolic Functions Regression
by: Gao, Yeqi, et al.
Published: (2023)

Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization
by: Sahu, Sharan, et al.
Published: (2026)

StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models
by: Yu, Dingzhi, et al.
Published: (2026)