| Main Authors: | Jovanović, Andrej, Iacob, Alex, Safaryan, Mher, Modoranu, Ionut-Vlad, Sani, Lorenzo, Shen, William F., Qiu, Xinchi, Alistarh, Dan, Lane, Nicholas D. |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.04396 |
Similar Items
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
by: Modoranu, Ionut-Vlad, et al.
Published: (2026)
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
by: Robert, Thomas, et al.
Published: (2024)
FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models
by: Modoranu, Ionut-Vlad, et al.
Published: (2025)
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
by: Wu, Diyuan, et al.
Published: (2024)
DASH: Faster Shampoo via Batched Block Preconditioning and Efficient Inverse-Root Solvers
by: Modoranu, Ionut-Vlad, et al.
Published: (2026)
MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates
by: Iacob, Alex, et al.
Published: (2025)
Unified Scaling Laws for Compressed Representations
by: Panferov, Andrei, et al.
Published: (2025)
DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models
by: Iacob, Alex, et al.
Published: (2025)
MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
by: Modoranu, Ionut-Vlad, et al.
Published: (2024)
Error Feedback Can Accurately Compress Preconditioners
by: Modoranu, Ionut-Vlad, et al.
Published: (2023)
Towards Robust Scaling Laws for Optimizers
by: Volkova, Alexandra, et al.
Published: (2026)
CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
by: Tabesh, Soroush, et al.
Published: (2025)
DEPT: Decoupled Embeddings for Pre-training Language Models
by: Iacob, Alex, et al.
Published: (2024)
LLM Unlearning via Neural Activation Redirection
by: Shen, William F., et al.
Published: (2025)
Position: It's Time to Act on the Risk of Efficient Personalized Text Generation
by: Iofinova, Eugenia, et al.
Published: (2025)
GradSkip: Communication-Accelerated Local Gradient Methods with Better Computational Complexity
by: Maranjyan, Artavazd, et al.
Published: (2022)
Sheaf HyperNetworks for Personalized Federated Learning
by: Nguyen, Bao, et al.
Published: (2024)
Pollen: High-throughput Federated Learning Simulation via Resource-Aware Client Placement
by: Sani, Lorenzo, et al.
Published: (2023)
FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients
by: Qiu, Xinchi, et al.
Published: (2024)
On Biased Compression for Distributed Learning
by: Beznosikov, Aleksandr, et al.
Published: (2020)
SparsyFed: Sparse Adaptive Federated Training
by: Guastella, Adriano, et al.
Published: (2025)
Optimizers Qualitatively Alter Solutions And We Should Leverage This
by: Pascanu, Razvan, et al.
Published: (2025)
Photon: Federated LLM Pre-Training
by: Sani, Lorenzo, et al.
Published: (2024)
AbbIE: Autoregressive Block-Based Iterative Encoder for Efficient Sequence Modeling
by: Aleksandrov, Preslav, et al.
Published: (2025)
The Future of Large Language Model Pre-training is Federated
by: Sani, Lorenzo, et al.
Published: (2024)
Worldwide Federated Training of Language Models
by: Iacob, Alex, et al.
Published: (2024)
Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems
by: Tastan, Nurbek, et al.
Published: (2026)
SEAT: Sparse Entity-Aware Tuning for Knowledge Adaptation while Preserving Epistemic Abstention
by: Shen, William F., et al.
Published: (2025)
Gradient-less Federated Gradient Boosting Trees with Learnable Learning Rates
by: Ma, Chenyang, et al.
Published: (2023)
Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages
by: Zhao, Wanru, et al.
Published: (2025)
Panza: Design and Analysis of a Fully-Local Personalized Text Writing Assistant
by: Nicolicioiu, Armand, et al.
Published: (2024)
Hymnographic Indicators of the Armenian Renaissance
by: Mher Navoyan
Published: (2026)
Speculative Decoding Speed-of-Light: Optimal Lower Bounds via Branching Random Walks
by: Pankratov, Sergey, et al.
Published: (2025)
Simple Opinion Dynamics for No-Regret Learning
by: Lazarsfeld, John, et al.
Published: (2023)
Behemoth: Benchmarking Unlearning in LLMs Using Fully Synthetic Data
by: Iofinova, Eugenia, et al.
Published: (2026)
LLMQ: Efficient Lower-Precision Pretraining for Consumer GPUs
by: Schultheis, Erik, et al.
Published: (2025)
Model Compression with Exact Budget Constraints via Riemannian Manifolds
by: Helcig, Michael, et al.
Published: (2026)
Communication-Efficient Federated Learning With Data and Client Heterogeneity
by: Zakerinia, Hossein, et al.
Published: (2022)
Optimizing Classification of Infrequent Labels by Reducing Variability in Label Distribution
by: Agarwal, Ashutosh
Published: (2025)
Deep Unrolling of Sparsity-Induced RDO for 3D Point Cloud Attribute Coding
by: Do, Tam Thuc, et al.
Published: (2025)