:: Library Catalog

Bibliographic Details
Main Authors:	He, Jianliang; Wang, Leda; Chen, Siyu; Yang, Zhuoran
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Optimization and Control
Online Access:	https://arxiv.org/abs/2602.16849

Similar Items

Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
by: Chen, Siyu, et al.
Published: (2024)

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
by: Zhang, Yufeng, et al.
Published: (2021)

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality
by: Chen, Siyu, et al.
Published: (2024)

TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
by: Menezes, Michael, et al.
Published: (2025)

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
by: Shen, Han, et al.
Published: (2024)

A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation
by: Boursier, Etienne, et al.
Published: (2025)

Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
by: Cai, Qi, et al.
Published: (2022)

Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
by: Kim, Juno, et al.
Published: (2023)

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
by: Zhang, Yufeng, et al.
Published: (2020)

A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization
by: Zhu, Yuchen, et al.
Published: (2024)

Provably Efficient Exploration in Policy Optimization
by: Cai, Qi, et al.
Published: (2019)

Bridging Lottery Ticket and Grokking: Understanding Grokking from Inner Structure of Networks
by: Minegishi, Gouki, et al.
Published: (2023)

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach
by: Qiu, Shuang, et al.
Published: (2022)

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
by: Li, Zihao, et al.
Published: (2024)

Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization
by: Yang, Zhuoran, et al.
Published: (2020)

Grokking or Glitching? How Low-Precision Drives Slingshot Loss Spikes
by: Hanqing, Liu, et al.
Published: (2026)

Grokking as Structural Inference: Transformers Need Bayesian Lottery Tickets
by: Hidajat, Kai, et al.
Published: (2026)

Fourier Learning Machines: Nonharmonic Fourier-Based Neural Networks for Scientific Machine Learning
by: Rubel, Mominul, et al.
Published: (2025)

Modular Distributed Nonconvex Learning with Error Feedback
by: Carnevale, Guido, et al.
Published: (2025)

Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices
by: Han, Qingyu, et al.
Published: (2025)

Guided by the Experts: Provable Feature Learning Dynamic of Soft-Routed Mixture-of-Experts
by: Liao, Fangshuo, et al.
Published: (2025)

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity
by: Amortila, Philip, et al.
Published: (2024)

Dynamic Controlled Variables Based Dynamic Self-Optimizing Control
by: Zhou, Chenchen, et al.
Published: (2026)

Which Features are Best for Successor Features?
by: Ollivier, Yann
Published: (2025)

A Modular Algorithm for Non-Stationary Online Convex-Concave Optimization
by: Meng, Qing-xin, et al.
Published: (2025)

Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
by: Sheen, Heejune, et al.
Published: (2024)

Shallow Neural Networks Learn Low-Degree Spherical Polynomials with Feature Learning by Learnable Channel Attention
by: Yang, Yingzhen
Published: (2025)

Decision-Dependent Stochastic Optimization: The Role of Distribution Dynamics
by: He, Zhiyu, et al.
Published: (2025)

Enhancing Unsupervised Feature Selection via Double Sparsity Constrained Optimization
by: Xiu, Xianchao, et al.
Published: (2025)

A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints
by: Chen, Xi, et al.
Published: (2024)

A Theory of Feature Learning in Kernel Models
by: Chen, Yunlu, et al.
Published: (2023)

A Mechanism Study of Delayed Loss Spikes in Batch-Normalized Linear Models
by: Gao, Peifeng, et al.
Published: (2026)

Multi-level Monte-Carlo Gradient Methods for Stochastic Optimization with Biased Oracles
by: Hu, Yifan, et al.
Published: (2024)

Bi-Sparse Unsupervised Feature Selection
by: Xiu, Xianchao, et al.
Published: (2024)

Feature-Based Interpretable Surrogates for Optimization
by: Goerigk, Marc, et al.
Published: (2024)

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
by: Caron, Francois, et al.
Published: (2023)

Negative Imaginary Neural ODEs: Learning to Control Mechanical Systems with Stability Guarantees
by: Shi, Kanghong, et al.
Published: (2025)

Random Features Approximation for Control-Affine Systems
by: Kazemian, Kimia, et al.
Published: (2024)

A Compositional Kernel Model for Feature Learning
by: Ruan, Feng, et al.
Published: (2025)

Supervised Feature Compression based on Counterfactual Analysis
by: Piccialli, Veronica, et al.
Published: (2022)