:: Library Catalog

Bibliographic Details
Main Authors:	Marcotte, Sibylle; Gribonval, Rémi; Peyré, Gabriel
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Optimization and Control
Online Access:	https://arxiv.org/abs/2405.12888

Similar Items

Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows
by: Marcotte, Sibylle, et al.
Published: (2023)

Transformative or Conservative? Conservation laws for ResNets and Transformers
by: Marcotte, Sibylle, et al.
Published: (2025)

Intrinsic training dynamics of deep neural networks
by: Marcotte, Sibylle, et al.
Published: (2025)

Muon Dynamics as a Spectral Wasserstein Flow
by: Peyré, Gabriel
Published: (2026)

Path-conditioned training: a principled way to rescale ReLU neural networks
by: Lebeurrier, Arthur, et al.
Published: (2026)

Robust Sublinear Convergence Rates for Iterative Bregman Projections
by: Peyré, Gabriel
Published: (2026)

Optimal and Diffusion Transports in Machine Learning
by: Peyré, Gabriel
Published: (2025)

Optimal Transport for Machine Learners
by: Peyré, Gabriel
Published: (2025)

On the global convergence of gradient descent for wide shallow models with bounded nonlinearities
by: Petit, Romain, et al.
Published: (2026)

Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
by: Barboni, Raphaël, et al.
Published: (2024)

Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime
by: Barboni, Raphaël, et al.
Published: (2025)

Shuffling Momentum Gradient Algorithm for Convex Optimization
by: Tran, Trang H., et al.
Published: (2024)

Non-Euclidean Gradient Descent Operates at the Edge of Stability
by: Islamov, Rustem, et al.
Published: (2026)

Adaptive Optimization via Momentum on Variance-Normalized Gradients
by: Patitucci, Francisco, et al.
Published: (2026)

Momentum Does Not Reduce Stochastic Noise in Stochastic Gradient Descent
by: Sato, Naoki, et al.
Published: (2024)

Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization
by: Liu, Wei, et al.
Published: (2025)

From Score Matching to Diffusion: A Fine-Grained Error Analysis in the Gaussian Setting
by: Hurault, Samuel, et al.
Published: (2025)

Training Infinitely Deep and Wide Transformers
by: Barboni, Raphaël, et al.
Published: (2026)

Policy Gradient with Second Order Momentum
by: Sun, Tianyu
Published: (2025)

Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
by: Feng, Jie, et al.
Published: (2024)

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
by: Phunyaphibarn, Prin, et al.
Published: (2023)

First and Second Order Approximations to Stochastic Gradient Descent Methods with Momentum Terms
by: Lu, Eric
Published: (2025)

Understanding Gradient Orthogonalization for Deep Learning via Non-Euclidean Trust-Region Optimization
by: Kovalev, Dmitry
Published: (2025)

Adaptive Momentum and Nonlinear Damping for Neural Network Training
by: Karoni, Aikaterini, et al.
Published: (2026)

Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf's Law
by: Kunstner, Frederik, et al.
Published: (2025)

Neighbor-Sampling Based Momentum Stochastic Methods for Training Graph Neural Networks
by: Noel, Molly, et al.
Published: (2025)

GANs as Gradient Flows that Converge
by: Huang, Yu-Jui, et al.
Published: (2022)

Flowing Datasets with Wasserstein over Wasserstein Gradient Flows
by: Bonet, Clément, et al.
Published: (2025)

Algorithmic Stability of Stochastic Gradient Descent with Momentum under Heavy-Tailed Noise
by: Dang, Thanh, et al.
Published: (2025)

Safe Gradient Flow for Bilevel Optimization
by: Sharifi, Sina, et al.
Published: (2025)

WeightLoRA: Keep Only Necessary Adapters
by: Veprikov, Andrey, et al.
Published: (2025)

Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks
by: Beneventano, Pierfrancesco, et al.
Published: (2025)

Modified Loss of Momentum Gradient Descent: Fine-Grained Analysis
by: Cattaneo, Matias D., et al.
Published: (2025)

Improving Stochastic Cubic Newton with Momentum
by: Chayti, El Mahdi, et al.
Published: (2024)

Stochastic Difference-of-Convex Optimization with Momentum
by: Chayti, El Mahdi, et al.
Published: (2025)

Dimension-adapted Momentum Outscales SGD
by: Ferbach, Damien, et al.
Published: (2025)

Multi-Objective Optimization via Wasserstein-Fisher-Rao Gradient Flow
by: Ren, Yinuo, et al.
Published: (2023)

Gradient Flow Polarizes Softmax Outputs towards Low-Entropy Solutions
by: Varre, Aditya, et al.
Published: (2026)

Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
by: Yamamoto, Naoya, et al.
Published: (2025)

Grams: Gradient Descent with Adaptive Momentum Scaling
by: Cao, Yang, et al.
Published: (2024)