| Main Authors: | Wu, Xiaodong, Yu, Wenyi, Zhang, Chao, Woodland, Philip |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.06420 |
Similar Items
Fisher-Orthogonal Projected Natural Gradient Descent for Continual Learning
by: Garg, Ishir, et al.
Published: (2026)
Beyond the Mean: Fisher-Orthogonal Projection for Natural Gradient Descent in Large Batch Training
by: Lu, Yishun, et al.
Published: (2025)
Fishers for Free? Approximating the Fisher Information Matrix by Recycling the Squared Gradient Accumulator
by: Li, YuXin, et al.
Published: (2025)
Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
by: Guzmán-Cordero, Andrés, et al.
Published: (2025)
Thermodynamic Natural Gradient Descent
by: Donatella, Kaelan, et al.
Published: (2024)
Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning
by: Huo, Yingxiao, et al.
Published: (2026)
Kernel Approximation of Fisher-Rao Gradient Flows
by: Zhu, Jia-Jie, et al.
Published: (2024)
ONG: Orthogonal Natural Gradient Descent
by: Yadav, Yajat, et al.
Published: (2025)
Randomness and Interpolation Improve Gradient Descent
by: Li, Jiawen, et al.
Published: (2025)
Is All Learning (Natural) Gradient Descent?
by: Shoji, Lucas, et al.
Published: (2024)
Multi-head Temporal Latent Attention
by: Deng, Keqi, et al.
Published: (2025)
Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency
by: Wu, Jingfeng, et al.
Published: (2024)
Weighted Low-rank Approximation via Stochastic Gradient Descent on Manifolds
by: Xu, Conglong, et al.
Published: (2025)
Learning Provably Improves the Convergence of Gradient Descent
by: Song, Qingyu, et al.
Published: (2025)
Gauss-Newton Natural Gradient Descent for Shape Learning
by: King, James, et al.
Published: (2026)
Inversion-Free Natural Gradient Descent on Riemannian Manifolds
by: Draca, Dario, et al.
Published: (2026)
Towards Continuous-Time Approximations for Stochastic Gradient Descent without Replacement
by: Perko, Stefan
Published: (2025)
Riemannian Laplace Approximation with the Fisher Metric
by: Yu, Hanlin, et al.
Published: (2023)
Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization
by: Xiong, Zikai, et al.
Published: (2022)
FGGM: Fisher-Guided Gradient Masking for Continual Learning
by: Tan, Chao-Hong, et al.
Published: (2026)
NysAct: A Scalable Preconditioned Gradient Descent using Nystrom Approximation
by: Seung, Hyunseok, et al.
Published: (2025)
Approximation and Gradient Descent Training with Neural Networks
by: Welper, G.
Published: (2024)
Benefits of Early Stopping in Gradient Descent for Overparameterized Logistic Regression
by: Wu, Jingfeng, et al.
Published: (2025)
Natural Gradient Descent for Online Continual Learning
by: Khawand, Joe, et al.
Published: (2026)
Distributed Gradient Descent for Functional Learning
by: Yu, Zhan, et al.
Published: (2023)
Convergence Properties of Natural Gradient Descent for Minimizing KL Divergence
by: Datar, Adwait, et al.
Published: (2025)
Natural Gradient Descent: Empirical Validation of Local Efficiency and Coordinate Invariance
by: HIDEKI
Published: (2025)
How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime?
by: Huang, Wei, et al.
Published: (2025)
Occam Gradient Descent
by: Kausik, B. N.
Published: (2024)
Revisiting Gradient Descent: A Dual-Weight Method for Improved Learning
by: Wang, Xi
Published: (2025)
Increasing Batch Size Improves Convergence of Stochastic Gradient Descent with Momentum
by: Kamo, Keisuke, et al.
Published: (2025)
Revisiting Stochastic Approximation and Stochastic Gradient Descent
by: Karandikar, Rajeeva Laxman, et al.
Published: (2025)
First and Second Order Approximations to Stochastic Gradient Descent Methods with Momentum Terms
by: Lu, Eric
Published: (2025)
Accelerating Natural Gradient Descent for PINNs with Randomized Numerical Linear Algebra
by: Bioli, Ivan, et al.
Published: (2025)
The Global Empirical NTK: Self-Referential Bias and Dimensionality of Gradient Descent Learning
by: Hazelden, James, et al.
Published: (2026)
Stochastic Adaptive Gradient Descent Without Descent
by: Aujol, Jean-François, et al.
Published: (2025)
Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation
by: Lashkarashvili, Nineli, et al.
Published: (2024)
Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits
by: Chen, Hao, et al.
Published: (2021)
Stacking as Accelerated Gradient Descent
by: Agarwal, Naman, et al.
Published: (2024)
Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks
by: Xu, Xianliang, et al.
Published: (2024)