| Main Authors: | Kong, Boao, Liang, Junzhu, Liu, Yuxi, Deng, Renjia, Yuan, Kun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.18993 |
Similar Items
RoPeSLR: 3D RoPE-driven Sparse-LowRank Attention for Efficient Diffusion Transformers
by: Liu, Yuxi, et al.
Published: (2026)
Synergistic Intra- and Cross-Layer Regularization Losses for MoE Expert Specialization
by: Hu, Rizhen, et al.
Published: (2026)
MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
by: Liu, Yuxi, et al.
Published: (2025)
On the Convergence of Stochastic Gradient Descent with Perturbed Forward-Backward Passes
by: Kong, Boao, et al.
Published: (2026)
BROS: Bias-Corrected Randomized Subspaces for Memory-Efficient Single-Loop Bilevel Optimization
by: Zhang, Hengrui, et al.
Published: (2026)
Row-Stochastic Matrices Can Provably Outperform Doubly Stochastic Matrices in Decentralized Learning
by: Liu, Bing, et al.
Published: (2025)
GNMR: Runtime Stability Control for Low-Precision Large Language Model Training
by: Kong, Boao, et al.
Published: (2026)
Clapping: Removing Per-sample Storage for Pipeline Parallel Distributed Optimization with Communication Compression
by: Kong, Boao, et al.
Published: (2025)
SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization
by: Zhu, Shuchen, et al.
Published: (2024)
Decentralized Bilevel Optimization: A Perspective from Transient Iteration Complexity
by: Kong, Boao, et al.
Published: (2024)
Structure-Preserving Network Compression Via Low-Rank Induced Training Through Linear Layers Composition
by: Zhang, Xitong, et al.
Published: (2024)
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
by: Wang, Sheng, et al.
Published: (2024)
Boosting the Accuracy of Stock Market Prediction via Multi-Layer Hybrid MTL Structure
by: Hong, Yuxi
Published: (2025)
LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing
by: Zhang, Ruijie, et al.
Published: (2025)
LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
by: Zhang, Mingyang, et al.
Published: (2023)
Efficient Pareto Manifold Learning with Low-Rank Structure
by: Chen, Weiyu, et al.
Published: (2024)
On Catastrophic Forgetting in Low-Rank Decomposition-Based Parameter-Efficient Fine-Tuning
by: Ahmad, Muhammad, et al.
Published: (2026)
Joint Tensor-Train Parameterization for Efficient and Expressive Low-Rank Adaptation
by: Qi, Jun, et al.
Published: (2025)
Parameter Efficient Continual Learning with Dynamic Low-Rank Adaptation
by: Bhat, Prashant Shivaram, et al.
Published: (2025)
Mixture of Distributions Matters: Dynamic Sparse Attention for Efficient Video Diffusion Transformers
by: Liu, Yuxi, et al.
Published: (2026)
ScaLoRA: Optimally Scaled Low-Rank Adaptation for Efficient High-Rank Fine-Tuning
by: Zhang, Yilang, et al.
Published: (2025)
Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients
by: Wang, Yezhen, et al.
Published: (2025)
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
by: Zhao, Jiawei, et al.
Published: (2024)
CLoRA: Parameter-Efficient Continual Learning with Low-Rank Adaptation
by: Muralidhara, Shishir, et al.
Published: (2025)
Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach
by: Mao, Yuzhu, et al.
Published: (2024)
Top-$k$ Feature Importance Ranking
by: Chen, Yuxi, et al.
Published: (2025)
A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models
by: Chen, Yiming, et al.
Published: (2025)
On Generalization Bounds for Neural Networks with Low Rank Layers
by: Pinto, Andrea, et al.
Published: (2024)
DropLoRA: Sparse Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
by: Zhang, Haojie
Published: (2025)
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
by: Jung, Yeonjoon, et al.
Published: (2025)
From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees
by: Xie, Shengping, et al.
Published: (2025)
CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
by: Liu, Ziyue, et al.
Published: (2025)
Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees
by: Chen, Chuyan, et al.
Published: (2025)
Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning
by: Sehanobish, Arijit, et al.
Published: (2024)
An Overview of Low-Rank Structures in the Training and Adaptation of Large Models
by: Balzano, Laura, et al.
Published: (2025)
Fast Forwarding Low-Rank Training
by: Rahamim, Adir, et al.
Published: (2024)
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
by: Jiang, Ting, et al.
Published: (2024)
ReLaX-Net: Reusing Layers for Parameter-Efficient Physical Neural Networks
by: Tsuchiyama, Kohei, et al.
Published: (2025)
Low Rank Multi-Dictionary Selection at Scale
by: Ma, Boya, et al.
Published: (2024)
Efficient Generalized Low-Rank Tensor Contextual Bandits
by: Yi, Qianxin, et al.
Published: (2023)