| Main Authors: | Song, Zhao; Yue, Song |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2305.08001 |
Similar Items
Fast Inference with Kronecker-Sparse Matrices
by: Gonon, Antoine, et al.
Published: (2024)
Rethinking Bregman Divergences in Kronecker-Factored Optimizers
by: Liu, Bing, et al.
Published: (2026)
Quantum Speedup for Spectral Approximation of Kronecker Products
by: Gao, Yeqi, et al.
Published: (2024)
Scaling Laws of SignSGD in Linear Regression: When Does It Outperform SGD?
by: Kim, Jihwan, et al.
Published: (2026)
Does SGD really happen in tiny subspaces?
by: Song, Minhak, et al.
Published: (2024)
Topology-aware Generalization of Decentralized SGD
by: Zhu, Tongtian, et al.
Published: (2022)
Fast and Efficient Matching Algorithm with Deadline Instances
by: Song, Zhao, et al.
Published: (2023)
Generalization and Optimization of SGD with Lookahead
by: Li, Kangcheng, et al.
Published: (2025)
Kronecker-Structured Nonparametric Spatiotemporal Point Processes
by: Xu, Zhitong, et al.
Published: (2026)
Scalable Gaussian Processes with Latent Kronecker Structure
by: Lin, Jihao Andreas, et al.
Published: (2025)
Bootstrap SGD: Algorithmic Stability and Robustness
by: Christmann, Andreas, et al.
Published: (2024)
A Short Note on Batch-efficient Divide-and-Conquer Algorithm for EigenDecomposition
by: Song, Yue
Published: (2026)
The ADMM-PINNs Algorithmic Framework for Nonsmooth PDE-Constrained Optimization: A Deep Learning Approach
by: Song, Yongcun, et al.
Published: (2023)
Diagonalisation SGD: Fast & Convergent SGD for Non-Differentiable Models via Reparameterisation and Smoothing
by: Wagner, Dominik, et al.
Published: (2024)
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
by: Bars, Batiste Le, et al.
Published: (2023)
Non-Euclidean SGD for Structured Optimization: Unified Analysis and Improved Rates
by: Kovalev, Dmitry, et al.
Published: (2025)
Faster Algorithms for Structured Linear and Kernel Support Vector Machines
by: Gu, Yuzhou, et al.
Published: (2023)
Higher-Order Transformers With Kronecker-Structured Attention
by: Omranpour, Soroush, et al.
Published: (2024)
Truncated Non-Uniform Quantization for Distributed SGD
by: Yan, Guangfeng, et al.
Published: (2024)
SLowcal-SGD: Slow Query Points Improve Local-SGD for Stochastic Convex Optimization
by: Dahan, Tehila, et al.
Published: (2023)
Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform
by: Alman, Josh, et al.
Published: (2025)
Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
by: Attia, Amit, et al.
Published: (2025)
Learning Kronecker-Structured Graphs from Smooth Signals
by: Shi, Changhao, et al.
Published: (2025)
SketchySGD: Reliable Stochastic Optimization via Randomized Curvature Estimates
by: Frangella, Zachary, et al.
Published: (2022)
Suspicious Alignment of SGD: A Fine-Grained Step Size Condition Analysis
by: Deng, Shenyang, et al.
Published: (2026)
The Optimization Landscape of SGD Across the Feature Learning Strength
by: Atanasov, Alexander, et al.
Published: (2024)
Stochastic Resetting Mitigates Latent Gradient Bias of SGD from Label Noise
by: Bae, Youngkyoung, et al.
Published: (2024)
Trustworthy Efficient Communication for Distributed Learning using LQ-SGD Algorithm
by: Li, Hongyang, et al.
Published: (2025)
Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)
From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees
by: Xie, Shengping, et al.
Published: (2025)
Minibatch and Local SGD: Algorithmic Stability and Linear Speedup in Generalization
by: Lei, Yunwen, et al.
Published: (2023)
Scaling Gaussian Processes for Learning Curve Prediction via Latent Kronecker Structure
by: Lin, Jihao Andreas, et al.
Published: (2024)
Fast John Ellipsoid Computation with Differential Privacy Optimization
by: Li, Xiaoyu, et al.
Published: (2024)
The Optimality of (Accelerated) SGD for High-Dimensional Quadratic Optimization
by: Zhang, Haihan, et al.
Published: (2024)
SGD with Partial Hessian for Deep Neural Networks Optimization
by: Sun, Ying, et al.
Published: (2024)
Optimal Projection-Free Adaptive SGD for Matrix Optimization
by: Kovalev, Dmitry
Published: (2026)
On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization
by: Sahu, Sharan, et al.
Published: (2026)
An Iterative Algorithm for Rescaled Hyperbolic Functions Regression
by: Gao, Yeqi, et al.
Published: (2023)
Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization
by: Sahu, Sharan, et al.
Published: (2026)
StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models
by: Yu, Dingzhi, et al.
Published: (2026)