:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Jianneng; Morozov, Alexandre V.
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; 68T07; G.1.6
Online Access:	https://arxiv.org/abs/2602.21276

Similar Items

Stochastic Estimation of the Layer-wise Hessian Trace for Monitoring Neural-network Training
by: Bolshim, Maxim, et al.
Published: (2026)

On the Convergence Behavior of Preconditioned Gradient Descent Toward the Rich Learning Regime
by: Jiang, Shuai, et al.
Published: (2026)

Data-induced multiscale losses and efficient multirate gradient descent schemes
by: He, Juncai, et al.
Published: (2024)

Local properties of neural networks through the lens of layer-wise Hessians
by: Bolshim, Maxim, et al.
Published: (2025)

Inter-Layer Hessian Analysis of Neural Networks with DAG Architectures
by: Bolshim, Maxim, et al.
Published: (2026)

Gradient descent provably escapes saddle points in the training of shallow ReLU networks
by: Cheridito, Patrick, et al.
Published: (2022)

Kourkoutas-Beta: A Sunspike-Driven Adam Optimizer with Desert Flair
by: Kassinos, Stavros C.
Published: (2025)

Non-convergence to the optimal risk for Adam and stochastic gradient descent optimization in the training of deep neural networks
by: Do, Thang, et al.
Published: (2025)

EB-gMCR: Energy-Based Generative Modeling for Signal Unmixing and Multivariate Curve Resolution
by: Chang, Yu-Tang, et al.
Published: (2025)

ZetA: A Riemann Zeta-Scaled Extension of Adam for Deep Learning
by: BC, Samiksha
Published: (2025)

GDNSQ: Gradual Differentiable Noise Scale Quantization for Low-bit Neural Networks
by: Salishev, Sergey, et al.
Published: (2025)

Diagnosing Failure Modes of Neural Operators Across Diverse PDE Families
by: Shikhman, Lennon
Published: (2026)

Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum
by: Zhang, Minxin, et al.
Published: (2026)

Context-dependent manifold learning: A neuromodulated constrained autoencoder approach
by: Adriaens, Jérôme, et al.
Published: (2026)

Ghosts of Softmax: Complex Singularities That Limit Safe Step Sizes in Cross-Entropy
by: Sao, Piyush
Published: (2026)

Total Generalized Variation regularization closes the gap between neural-field and classical methods in seismic travel-time tomography
by: Kurosawa, Isao
Published: (2026)

Resolving gradient pathology in physics-informed epidemiological models
by: Golooba, Nickson, et al.
Published: (2026)

Physics Informed Differentiable Solvers for Learning Parametric Solution Manifolds in Heterogeneous Physical Systems
by: Panahi, Milad, et al.
Published: (2026)

Model-Free Local Recalibration of Neural Networks
by: Torres, R., et al.
Published: (2024)

Spiking Neural Networks for SAR Interferometric Phase Unwrapping: A Theoretical Framework for Energy-Efficient Processing
by: Bara, Marc
Published: (2025)

Differentiable Optimization Layers for Guaranteed Fairness in Deep Learning
by: Troxell, David, et al.
Published: (2026)

Deep Learning and Elicitability for McKean-Vlasov FBSDEs With Common Noise
by: Antunes, Felipe J. P., et al.
Published: (2025)

Neural Green's Operators for Parametric Partial Differential Equations
by: Melchers, Hugo, et al.
Published: (2024)

Manifold limit for the training of shallow graph convolutional neural networks
by: Tengler, Johanna, et al.
Published: (2026)

PyEPO: A PyTorch-based End-to-End Predict-then-Optimize Library for Linear and Integer Programming
by: Tang, Bo, et al.
Published: (2022)

Non-convergence to global minimizers in data driven supervised deep learning: Adam and stochastic gradient descent optimization provably fail to converge to global minimizers in the training of deep neural networks with ReLU activation
by: Do, Thang, et al.
Published: (2024)

CAO: Curvature-Adaptive Optimization via Periodic Low-Rank Hessian Sketching
by: Du, Wenzhang
Published: (2025)

Benchmarking Generative AI Against Bayesian Optimization for Constrained Multi-Objective Inverse Design
by: Awan, Muhammad Bilal, et al.
Published: (2025)

The Polar Express: Optimal Matrix Sign Methods and Their Application to the Muon Algorithm
by: Amsel, Noah, et al.
Published: (2025)

Online Federation For Mixtures of Proprietary Agents with Black-Box Encoders
by: Yang, Xuwei, et al.
Published: (2025)

Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks
by: Dirksen, Sjoerd, et al.
Published: (2025)

Refining Graphical Neural Network Predictions Using Flow Matching for Optimal Power Flow with Constraint-Satisfaction Guarantee
by: Khanal, Kshitiz
Published: (2025)

Active Learning for Conditional Generative Compressed Sensing
by: DeLise, Alexander, et al.
Published: (2026)

The Neural Differential Manifold: An Architecture with Explicit Geometric Structure
by: Zhang, Di
Published: (2025)

Mathematical Foundations of Neural Tangents and Infinite-Width Networks
by: Mysore, Rachana, et al.
Published: (2025)

Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation
by: Xie, Runlong, et al.
Published: (2026)

Primal-Dual Sample Complexity Bounds for Constrained Markov Decision Processes with Multiple Constraints
by: Buckley, Max, et al.
Published: (2025)

Physics-Informed Neural Networks for Optimal Vaccination Plan in SIR Epidemic Models
by: Kim, Minseok, et al.
Published: (2025)

SCAPE: Searching Conceptual Architecture Prompts using Evolution
by: Lim, Soo Ling, et al.
Published: (2024)

Towards Coordinate- and Dimension-Agnostic Machine Learning for Partial Differential Equations
by: Phan, Trung V., et al.
Published: (2025)