:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Shaoqi; Yang, Chunjie; Lou, Siwei
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.15393

Similar Items

Beyond the Mean: Fisher-Orthogonal Projection for Natural Gradient Descent in Large Batch Training
by: Lu, Yishun, et al.
Published: (2025)

Fisher-Orthogonal Projected Natural Gradient Descent for Continual Learning
by: Garg, Ishir, et al.
Published: (2026)

Improving Autoformalization Using Direct Dependency Retrieval
by: Wang, Shaoqi, et al.
Published: (2025)

ONG: Orthogonal Natural Gradient Descent
by: Yadav, Yajat, et al.
Published: (2025)

Training Data Selection with Gradient Orthogonality for Efficient Domain Adaptation
by: Zhang, Xiyang, et al.
Published: (2026)

Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection
by: Yang, Ziyu, et al.
Published: (2026)

Orthogonalized Policy Optimization: Policy Optimization as Orthogonal Projection in Hilbert Space
by: Zixian, Wang
Published: (2026)

ROOT: Robust Orthogonalized Optimizer for Neural Network Training
by: He, Wei, et al.
Published: (2025)

Backward-Friendly Optimization: Training Large Language Models with Approximate Gradients under Memory Constraints
by: Yang, Jing, et al.
Published: (2025)

Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space
by: Zixian, Wang
Published: (2026)

Harnessing Orthogonality to Train Low-Rank Neural Networks
by: Coquelin, Daniel, et al.
Published: (2024)

Gradient Aligned Regression via Pairwise Losses
by: Zhu, Dixian, et al.
Published: (2024)

Gradient Weight-normalized Low-rank Projection for Efficient LLM Training
by: Huang, Jia-Hong, et al.
Published: (2024)

Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality
by: Lee, Sewoong, et al.
Published: (2025)

Clustering-Based Weight Orthogonalization for Stabilizing Deep Reinforcement Learning
by: Ma, Guoqing, et al.
Published: (2025)

Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning
by: Huo, Yingxiao, et al.
Published: (2026)

KAIROS: Unified Training for Universal Non-Autoregressive Time Series Forecasting
by: Ding, Kuiye, et al.
Published: (2025)

GradientStabilizer: Fix the Norm, Not the Gradient
by: Huang, Tianjin, et al.
Published: (2025)

Gradient Regularized Natural Gradients
by: Dash, Satya Prakash, et al.
Published: (2026)

Gradient-Free Training of Quantized Neural Networks
by: Cohen, Noa, et al.
Published: (2024)

Pro-KLShampoo: Projected KL-Shampoo with Whitening Recovered by Orthogonalization
by: Sun, Ruotong, et al.
Published: (2026)

GAC: Stabilizing Asynchronous RL Training for LLMs via Gradient Alignment Control
by: Xu, Haofeng, et al.
Published: (2026)

Enhancing Transformer-based models for Long Sequence Time Series Forecasting via Structured Matrix
by: Zhang, Zhicheng, et al.
Published: (2024)

Research and application of Transformer based anomaly detection model: A literature review
by: Ma, Mingrui, et al.
Published: (2024)

GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection
by: Su, DiJia, et al.
Published: (2025)

Lotus: Efficient LLM Training by Randomized Low-Rank Gradient Projection with Adaptive Subspace Switching
by: Miao, Tianhao, et al.
Published: (2026)

Automatic Stability and Recovery for Neural Network Training
by: Or, Barak
Published: (2026)

GoQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization
by: Xiang, Maoyang, et al.
Published: (2026)

Unbiased Gradient Low-Rank Projection
by: Pan, Rui, et al.
Published: (2025)

Benign Overfitting for Regression with Trained Two-Layer ReLU Networks
by: Park, Junhyung, et al.
Published: (2024)

LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration
by: Qiu, Ruiyu, et al.
Published: (2025)

Quantized Approximately Orthogonal Recurrent Neural Networks
by: Foucault, Armand, et al.
Published: (2024)

Test-Time Training on Graphs with Large Language Models (LLMs)
by: Zhang, Jiaxin, et al.
Published: (2024)

Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
by: Xiao, Mingqing, et al.
Published: (2024)

Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent
by: Liu, Weihua, et al.
Published: (2024)

Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
by: Che, Fengdi, et al.
Published: (2024)

Vertical Symbolic Regression via Deep Policy Gradient
by: Jiang, Nan, et al.
Published: (2024)

Orthogonal Subspace Projection for Continual Machine Unlearning via SVD-Based LoRA
by: Rahulamathavan, Yogachandran, et al.
Published: (2026)

Gradient-Congruity Guided Federated Sparse Training
by: Tian, Chris Xing, et al.
Published: (2024)

Matrix Low-Rank Approximation For Policy Gradient Methods
by: Rozada, Sergio, et al.
Published: (2024)