:: Library Catalog

Bibliographic Details
Main Authors:	Yadav, Yajat; Mendoza, Patrick; Korrapati, Jathin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2508.17169

Similar Items

VROOM - Visual Reconstruction over Onboard Multiview
by: Yadav, Yajat, et al.
Published: (2025)

Discrete vs. Continuous Trade-offs for Generative Models
by: Korrapati, Jathin, et al.
Published: (2024)

Can Transformers Break Encryption Schemes via In-Context Learning?
by: Korrapati, Jathin, et al.
Published: (2025)

Fisher-Orthogonal Projected Natural Gradient Descent for Continual Learning
by: Garg, Ishir, et al.
Published: (2026)

Beyond the Mean: Fisher-Orthogonal Projection for Natural Gradient Descent in Large Batch Training
by: Lu, Yishun, et al.
Published: (2025)

Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent
by: Liu, Weihua, et al.
Published: (2024)

Natural Gradient Descent for Online Continual Learning
by: Khawand, Joe, et al.
Published: (2026)

Gradient Descent Algorithm Survey
by: Fucheng, Deng, et al.
Published: (2025)

Approximated Orthogonal Projection Unit: Stabilizing Regression Network Training Using Natural Gradient
by: Wang, Shaoqi, et al.
Published: (2024)

Randomness and Interpolation Improve Gradient Descent
by: Li, Jiawen, et al.
Published: (2025)

Learning Associative Memories with Gradient Descent
by: Cabannes, Vivien, et al.
Published: (2024)

Adaptive Heavy-Tailed Stochastic Gradient Descent
by: Gong, Bodu, et al.
Published: (2025)

Vanilla Gradient Descent for Oblique Decision Trees
by: Panda, Subrat Prasad, et al.
Published: (2024)

Stochastic Gradient Descent with Momentum is Algorithmically Stable
by: Lei, Yunwen, et al.
Published: (2026)

Gradient Descent Efficiency Index
by: Dhingra, Aviral
Published: (2024)

The Initialization Determines Whether In-Context Learning Is Gradient Descent
by: Xie, Shifeng, et al.
Published: (2025)

Elastic Multi-Gradient Descent for Parallel Continual Learning
by: Lyu, Fan, et al.
Published: (2024)

Revisiting the Initial Steps in Adaptive Gradient Descent Optimization
by: Abuduweili, Abulikemu, et al.
Published: (2024)

Efficient Search for Customized Activation Functions with Gradient Descent
by: Strack, Lukas, et al.
Published: (2024)

Conflict-Averse Gradient Descent for Multi-task Learning
by: Liu, Bo, et al.
Published: (2021)

Noise Balance and Stationary Distribution of Stochastic Gradient Descent
by: Ziyin, Liu, et al.
Published: (2023)

Can LLMs predict the convergence of Stochastic Gradient Descent?
by: Zekri, Oussama, et al.
Published: (2024)

Generalized Euler Logarithm and its Applications in Machine Learning: Natural Gradient, Backpropagation, Generalized EG, Mirror Descent and OLPS
by: Cichocki, Andrzej
Published: (2025)

Gradient Regularized Natural Gradients
by: Dash, Satya Prakash, et al.
Published: (2026)

Training Data Selection with Gradient Orthogonality for Efficient Domain Adaptation
by: Zhang, Xiyang, et al.
Published: (2026)

Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
by: Huang, Jianhao, et al.
Published: (2025)

Geometrically Inspired Kernel Machines for Collaborative Learning Beyond Gradient Descent
by: Kumar, Mohit, et al.
Published: (2024)

GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent
by: Marton, Sascha, et al.
Published: (2023)

Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization
by: Kumar, Ramnath, et al.
Published: (2023)

Provable Benefit of Sign Descent: A Minimal Model Under Heavy-Tailed Class Imbalance
by: Yadav, Robin, et al.
Published: (2025)

PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
by: Xu, Mingjing, et al.
Published: (2024)

Optimization, Generalization and Differential Privacy Bounds for Gradient Descent on Kolmogorov-Arnold Networks
by: Wang, Puyu, et al.
Published: (2026)

Almost Bayesian: The Fractal Dynamics of Stochastic Gradient Descent
by: Hennick, Max, et al.
Published: (2025)

Finite-Time Analysis of Gradient Descent for Shallow Transformers
by: Arda, Enes, et al.
Published: (2026)

On the Convergence of (Stochastic) Gradient Descent for Kolmogorov-Arnold Networks
by: Gao, Yihang, et al.
Published: (2024)

Do pretrained Transformers Learn In-Context by Gradient Descent?
by: Shen, Lingfeng, et al.
Published: (2023)

Turning Stale Gradients into Stable Gradients: Coherent Coordinate Descent with Implicit Landscape Smoothing for Lightweight Zeroth-Order Optimization
by: Liang, Chen, et al.
Published: (2026)

Sven: Singular Value Descent as a Computationally Efficient Natural Gradient Method
by: Bright-Thonney, Samuel, et al.
Published: (2026)

FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
by: Liu, Junkang, et al.
Published: (2026)

Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization
by: Kaplan, Ahmet
Published: (2026)