| Main Authors: | Zmushko, Philip, Beznosikov, Aleksandr, Takáč, Martin, Horváth, Samuel |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2411.07837 |
Similar Items
LionMuon: Alternating Spectral and Sign Descent for Efficient Training
by: Bolatov, Arman, et al.
Published: (2026)
Random-reshuffled SARAH does not need a full gradient computations
by: Beznosikov, Aleksandr, et al.
Published: (2021)
Label Privacy in Split Learning for Large Models with Parameter-Efficient Training
by: Zmushko, Philip, et al.
Published: (2024)
Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning
by: Semenov, Andrei, et al.
Published: (2024)
Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
by: Veprikov, Andrey, et al.
Published: (2025)
Faster Than SVD, Smarter Than SGD: The OPLoRA Alternating Update
by: Almansoori, Abdulla Jasem, et al.
Published: (2025)
Beyond SGD, Without SVD: Proximal Subspace Iteration LoRA with Diagonal Fractional K-FAC
by: Almansoori, Abdulla Jasem, et al.
Published: (2026)
Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities
by: Beznosikov, Aleksandr, et al.
Published: (2023)
Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)
Convergence of Clipped-SGD for Convex $(L_0,L_1)$-Smooth Optimization with Heavy-Tailed Noise
by: Chezhegov, Savelii, et al.
Published: (2025)
Where Does Warm-Up Come From? Adaptive Scheduling for Norm-Constrained Optimizers
by: Riabinin, Artem, et al.
Published: (2026)
Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed
by: Chezhegov, Savelii, et al.
Published: (2024)
Collaborative and Efficient Personalization with Mixtures of Adaptors
by: Almansoori, Abdulla Jasem, et al.
Published: (2024)
AdaFRUGAL: Adaptive Memory-Efficient Training with Dynamic Control
by: Bui, Quang-Hung, et al.
Published: (2025)
On Biased Compression for Distributed Learning
by: Beznosikov, Aleksandr, et al.
Published: (2020)
Byzantine-Robust Optimization under $(L_0, L_1)$-Smoothness
by: Bolatov, Arman, et al.
Published: (2026)
Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness
by: Kornilov, Nikita, et al.
Published: (2025)
Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens
by: Rekut, Nikolai, et al.
Published: (2025)
Generalising Battery Control in Net-Zero Buildings via Personalised Federated RL
by: Avila, Nicolas M Cuadrado, et al.
Published: (2024)
Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting
by: Ustimenko, Aleksei, et al.
Published: (2023)
PaDPaF: Partial Disentanglement with Partially-Federated GANs
by: Almansoori, Abdulla Jasem, et al.
Published: (2022)
Stochastic Gradient Methods with Preconditioned Updates
by: Sadiev, Abdurakhmon, et al.
Published: (2022)
Federated Learning Can Find Friends That Are Advantageous
by: Tupitsa, Nazarii, et al.
Published: (2024)
Accelerated Stochastic ExtraGradient: Mixing Hessian and Gradient Similarity to Reduce Communication in Distributed and Federated Learning
by: Bylinkin, Dmitry, et al.
Published: (2024)
Generalized Policy Learning for Smart Grids: FL TRPO Approach
by: Li, Yunxiang, et al.
Published: (2024)
Accelerated Methods with Compressed Communications for Distributed Optimization Problems under Data Similarity
by: Bylinkin, Dmitry, et al.
Published: (2024)
FedPeWS: Personalized Warmup via Subnetworks for Enhanced Heterogeneous Federated Learning
by: Tastan, Nurbek, et al.
Published: (2024)
What Scalable Second-Order Information Knows for Pruning at Initialization
by: Navarrete, Ivo Gollini, et al.
Published: (2025)
Efficient Conformal Prediction under Data Heterogeneity
by: Plassier, Vincent, et al.
Published: (2023)
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
by: Beznosikov, Aleksandr, et al.
Published: (2023)
Hierarchical Mixture-of-Experts with Two-Stage Optimization
by: Molodtsov, Gleb, et al.
Published: (2026)
Decentralized Personalized Federated Learning for Min-Max Problems
by: Borodich, Ekaterina, et al.
Published: (2021)
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
by: Choudhury, Sayantan, et al.
Published: (2024)
LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
by: Tastan, Nurbek, et al.
Published: (2025)
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
by: Luo, Ruichen, et al.
Published: (2025)
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
by: Gorbunov, Eduard, et al.
Published: (2024)
Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization
by: Demidovich, Yury, et al.
Published: (2024)
Enhancing Stability of Physics-Informed Neural Network Training Through Saddle-Point Reformulation
by: Bylinkin, Dmitry, et al.
Published: (2025)
Optimal Data Splitting in Distributed Optimization for Machine Learning
by: Medyakov, Daniil, et al.
Published: (2024)
Gradient-Free Approaches is a Key to an Efficient Interaction with Markovian Stochasticity
by: Prokhorov, Boris, et al.
Published: (2026)