:: Library Catalog

Bibliographic Details
Main Authors:	Zmushko, Philip; Beznosikov, Aleksandr; Takáč, Martin; Horváth, Samuel
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2411.07837

Similar Items

LionMuon: Alternating Spectral and Sign Descent for Efficient Training
by: Bolatov, Arman, et al.
Published: (2026)

Random-reshuffled SARAH does not need a full gradient computations
by: Beznosikov, Aleksandr, et al.
Published: (2021)

Label Privacy in Split Learning for Large Models with Parameter-Efficient Training
by: Zmushko, Philip, et al.
Published: (2024)

Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning
by: Semenov, Andrei, et al.
Published: (2024)

Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
by: Veprikov, Andrey, et al.
Published: (2025)

Faster Than SVD, Smarter Than SGD: The OPLoRA Alternating Update
by: Almansoori, Abdulla Jasem, et al.
Published: (2025)

Beyond SGD, Without SVD: Proximal Subspace Iteration LoRA with Diagonal Fractional K-FAC
by: Almansoori, Abdulla Jasem, et al.
Published: (2026)

Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities
by: Beznosikov, Aleksandr, et al.
Published: (2023)

Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)

Convergence of Clipped-SGD for Convex $(L_0,L_1)$-Smooth Optimization with Heavy-Tailed Noise
by: Chezhegov, Savelii, et al.
Published: (2025)

Where Does Warm-Up Come From? Adaptive Scheduling for Norm-Constrained Optimizers
by: Riabinin, Artem, et al.
Published: (2026)

Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed
by: Chezhegov, Savelii, et al.
Published: (2024)

Collaborative and Efficient Personalization with Mixtures of Adaptors
by: Almansoori, Abdulla Jasem, et al.
Published: (2024)

AdaFRUGAL: Adaptive Memory-Efficient Training with Dynamic Control
by: Bui, Quang-Hung, et al.
Published: (2025)

On Biased Compression for Distributed Learning
by: Beznosikov, Aleksandr, et al.
Published: (2020)

Byzantine-Robust Optimization under $(L_0, L_1)$-Smoothness
by: Bolatov, Arman, et al.
Published: (2026)

Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness
by: Kornilov, Nikita, et al.
Published: (2025)

Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens
by: Rekut, Nikolai, et al.
Published: (2025)

Generalising Battery Control in Net-Zero Buildings via Personalised Federated RL
by: Avila, Nicolas M Cuadrado, et al.
Published: (2024)

Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting
by: Ustimenko, Aleksei, et al.
Published: (2023)

PaDPaF: Partial Disentanglement with Partially-Federated GANs
by: Almansoori, Abdulla Jasem, et al.
Published: (2022)

Stochastic Gradient Methods with Preconditioned Updates
by: Sadiev, Abdurakhmon, et al.
Published: (2022)

Federated Learning Can Find Friends That Are Advantageous
by: Tupitsa, Nazarii, et al.
Published: (2024)

Accelerated Stochastic ExtraGradient: Mixing Hessian and Gradient Similarity to Reduce Communication in Distributed and Federated Learning
by: Bylinkin, Dmitry, et al.
Published: (2024)

Generalized Policy Learning for Smart Grids: FL TRPO Approach
by: Li, Yunxiang, et al.
Published: (2024)

Accelerated Methods with Compressed Communications for Distributed Optimization Problems under Data Similarity
by: Bylinkin, Dmitry, et al.
Published: (2024)

FedPeWS: Personalized Warmup via Subnetworks for Enhanced Heterogeneous Federated Learning
by: Tastan, Nurbek, et al.
Published: (2024)

What Scalable Second-Order Information Knows for Pruning at Initialization
by: Navarrete, Ivo Gollini, et al.
Published: (2025)

Efficient Conformal Prediction under Data Heterogeneity
by: Plassier, Vincent, et al.
Published: (2023)

Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
by: Beznosikov, Aleksandr, et al.
Published: (2023)

Hierarchical Mixture-of-Experts with Two-Stage Optimization
by: Molodtsov, Gleb, et al.
Published: (2026)

Decentralized Personalized Federated Learning for Min-Max Problems
by: Borodich, Ekaterina, et al.
Published: (2021)

Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
by: Choudhury, Sayantan, et al.
Published: (2024)

LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
by: Tastan, Nurbek, et al.
Published: (2025)

Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
by: Luo, Ruichen, et al.
Published: (2025)

Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
by: Gorbunov, Eduard, et al.
Published: (2024)

Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization
by: Demidovich, Yury, et al.
Published: (2024)

Enhancing Stability of Physics-Informed Neural Network Training Through Saddle-Point Reformulation
by: Bylinkin, Dmitry, et al.
Published: (2025)

Optimal Data Splitting in Distributed Optimization for Machine Learning
by: Medyakov, Daniil, et al.
Published: (2024)

Gradient-Free Approaches is a Key to an Efficient Interaction with Markovian Stochasticity
by: Prokhorov, Boris, et al.
Published: (2026)