| Main Authors: | Wang, Shaoqi, Yang, Chunjie, Lou, Siwei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.15393 |
Similar Items
Beyond the Mean: Fisher-Orthogonal Projection for Natural Gradient Descent in Large Batch Training
by: Lu, Yishun, et al.
Published: (2025)
Fisher-Orthogonal Projected Natural Gradient Descent for Continual Learning
by: Garg, Ishir, et al.
Published: (2026)
Improving Autoformalization Using Direct Dependency Retrieval
by: Wang, Shaoqi, et al.
Published: (2025)
ONG: Orthogonal Natural Gradient Descent
by: Yadav, Yajat, et al.
Published: (2025)
Training Data Selection with Gradient Orthogonality for Efficient Domain Adaptation
by: Zhang, Xiyang, et al.
Published: (2026)
Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection
by: Yang, Ziyu, et al.
Published: (2026)
Orthogonalized Policy Optimization: Policy Optimization as Orthogonal Projection in Hilbert Space
by: Zixian, Wang
Published: (2026)
ROOT: Robust Orthogonalized Optimizer for Neural Network Training
by: He, Wei, et al.
Published: (2025)
Backward-Friendly Optimization: Training Large Language Models with Approximate Gradients under Memory Constraints
by: Yang, Jing, et al.
Published: (2025)
Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space
by: Zixian, Wang
Published: (2026)
Harnessing Orthogonality to Train Low-Rank Neural Networks
by: Coquelin, Daniel, et al.
Published: (2024)
Gradient Aligned Regression via Pairwise Losses
by: Zhu, Dixian, et al.
Published: (2024)
Gradient Weight-normalized Low-rank Projection for Efficient LLM Training
by: Huang, Jia-Hong, et al.
Published: (2024)
Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality
by: Lee, Sewoong, et al.
Published: (2025)
Clustering-Based Weight Orthogonalization for Stabilizing Deep Reinforcement Learning
by: Ma, Guoqing, et al.
Published: (2025)
Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning
by: Huo, Yingxiao, et al.
Published: (2026)
KAIROS: Unified Training for Universal Non-Autoregressive Time Series Forecasting
by: Ding, Kuiye, et al.
Published: (2025)
GradientStabilizer: Fix the Norm, Not the Gradient
by: Huang, Tianjin, et al.
Published: (2025)
Gradient Regularized Natural Gradients
by: Dash, Satya Prakash, et al.
Published: (2026)
Gradient-Free Training of Quantized Neural Networks
by: Cohen, Noa, et al.
Published: (2024)
Pro-KLShampoo: Projected KL-Shampoo with Whitening Recovered by Orthogonalization
by: Sun, Ruotong, et al.
Published: (2026)
GAC: Stabilizing Asynchronous RL Training for LLMs via Gradient Alignment Control
by: Xu, Haofeng, et al.
Published: (2026)
Enhancing Transformer-based models for Long Sequence Time Series Forecasting via Structured Matrix
by: Zhang, Zhicheng, et al.
Published: (2024)
Research and application of Transformer based anomaly detection model: A literature review
by: Ma, Mingrui, et al.
Published: (2024)
GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection
by: Su, DiJia, et al.
Published: (2025)
Lotus: Efficient LLM Training by Randomized Low-Rank Gradient Projection with Adaptive Subspace Switching
by: Miao, Tianhao, et al.
Published: (2026)
Automatic Stability and Recovery for Neural Network Training
by: Or, Barak
Published: (2026)
GoQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization
by: Xiang, Maoyang, et al.
Published: (2026)
Unbiased Gradient Low-Rank Projection
by: Pan, Rui, et al.
Published: (2025)
Benign Overfitting for Regression with Trained Two-Layer ReLU Networks
by: Park, Junhyung, et al.
Published: (2024)
LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration
by: Qiu, Ruiyu, et al.
Published: (2025)
Quantized Approximately Orthogonal Recurrent Neural Networks
by: Foucault, Armand, et al.
Published: (2024)
Test-Time Training on Graphs with Large Language Models (LLMs)
by: Zhang, Jiaxin, et al.
Published: (2024)
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
by: Xiao, Mingqing, et al.
Published: (2024)
Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent
by: Liu, Weihua, et al.
Published: (2024)
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
by: Che, Fengdi, et al.
Published: (2024)
Vertical Symbolic Regression via Deep Policy Gradient
by: Jiang, Nan, et al.
Published: (2024)
Orthogonal Subspace Projection for Continual Machine Unlearning via SVD-Based LoRA
by: Rahulamathavan, Yogachandran, et al.
Published: (2026)
Gradient-Congruity Guided Federated Sparse Training
by: Tian, Chris Xing, et al.
Published: (2024)
Matrix Low-Rank Approximation For Policy Gradient Methods
by: Rozada, Sergio, et al.
Published: (2024)