| Main Authors: | Chen, Yuchen, Shen, Yandi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.23527 |
Similar Items
Gradient descent for deep equilibrium single-index models
by: Dandapanthula, Sanjit, et al.
Published: (2025)
Learning single-index models via harmonic decomposition
by: Joshi, Nirmit, et al.
Published: (2025)
Spike-timing-dependent Hebbian learning as noisy gradient descent
by: Dexheimer, Niklas, et al.
Published: (2025)
On the rate of convergence of an over-parametrized Transformer classifier learned by gradient descent
by: Kohler, Michael, et al.
Published: (2023)
One-step corrected projected stochastic gradient descent for statistical estimation
by: Brouste, Alexandre, et al.
Published: (2023)
Universality of high-dimensional scaling limits of stochastic gradient descent
by: Gheissari, Reza, et al.
Published: (2025)
Long-time dynamics and universality of nonconvex gradient descent
by: Han, Qiyang
Published: (2025)
Convergence of flow-based generative models via proximal gradient descent in Wasserstein space
by: Cheng, Xiuyuan, et al.
Published: (2023)
Observable adjustments in single-index models for regularized M-estimators
by: Bellec, Pierre C
Published: (2022)
Uncertainty quantification by block bootstrap for differentially private stochastic gradient descent
by: Dette, Holger, et al.
Published: (2024)
Towards a unified framework for guided diffusion models
by: Jiao, Yuchen, et al.
Published: (2025)
Precise gradient descent training dynamics for finite-width multi-layer neural networks
by: Han, Qiyang, et al.
Published: (2025)
Training thermodynamic computers by gradient descent
by: Whitelam, Stephen
Published: (2025)
Empirical Bayes estimation: When does $g$-modeling beat $f$-modeling in theory (and in practice)?
by: Shen, Yandi, et al.
Published: (2022)
Fast Spawn&Prune (FS&P): Global convergence of stochastic conic particle gradient descent via birth/death process
by: De Castro, Yohann, et al.
Published: (2026)
Mallows-type model averaging: Non-asymptotic analysis and all-subset combination
by: Peng, Jingfu
Published: (2025)
Concentration of a sparse Bayesian model with Horseshoe prior in estimating high-dimensional precision matrix
by: Mai, The Tien
Published: (2024)
Transformers Meet In-Context Learning: A Universal Approximation Theory
by: Li, Gen, et al.
Published: (2025)
When Does Model Collapse Occur in Structured Interactive Learning?
by: Wu, Yuchen, et al.
Published: (2026)
Optimal Convergence Analysis of DDPM for General Distributions
by: Jiao, Yuchen, et al.
Published: (2025)
Gradient descent inference in empirical risk minimization
by: Han, Qiyang, et al.
Published: (2024)
eGAD! double descent is explained by Generalized Aliasing Decomposition
by: Transtrum, Mark K., et al.
Published: (2024)
Deflated HeteroPCA: Overcoming the curse of ill-conditioning in heteroskedastic PCA
by: Zhou, Yuchen, et al.
Published: (2023)
Average Gradient Outer Product in kernel regression provably recovers the central subspace for multi-index models
by: Zhu, Libin, et al.
Published: (2026)
Provable Efficiency of Guidance in Diffusion Models for General Data Distribution
by: Li, Gen, et al.
Published: (2025)
Stochastic gradient descent in high dimensions for multi-spiked tensor PCA
by: Arous, Gérard Ben, et al.
Published: (2024)
Minimax optimal submatrix detection: Sharp non-asymptotic rates
by: Knight, Parker, et al.
Published: (2026)
Sharp asymptotic theory for Q-learning with LDTZ learning rate and its generalization
by: Bonnerjee, Soham, et al.
Published: (2026)
Sliced gradient-enhanced Kriging for high-dimensional function approximation
by: Cheng, Kai, et al.
Published: (2022)
Non-asymptotic error bounds for probability flow ODEs under weak log-concavity
by: Kremling, Gitte, et al.
Published: (2025)
Non-asymptotic confidence regions on RKHS. The Paley-Wiener and standard Sobolev space cases
by: Gamboa, Fabrice, et al.
Published: (2025)
Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent
by: Patil, Pratik, et al.
Published: (2024)
Time-uniform central limit theory and asymptotic confidence sequences
by: Waudby-Smith, Ian, et al.
Published: (2021)
Contraction rates for conjugate gradient and Lanczos approximate posteriors in Gaussian process regression
by: Stankewitz, Bernhard, et al.
Published: (2024)
Move on Muon : A Hamiltonian probability gradient flow perspective of Muon optimizer
by: Mustafi, Aratrika, et al.
Published: (2026)
Local geometry of high-dimensional mixture models: Effective spectral theory and dynamical transitions
by: Arous, Gerard Ben, et al.
Published: (2025)
A non-asymptotic theory of Kernel Ridge Regression: deterministic equivalents, test error, and GCV estimator
by: Misiakiewicz, Theodor, et al.
Published: (2024)
Optimal transport natural gradient for statistical manifolds with continuous sample space
by: Chen, Yifan, et al.
Published: (2018)
Connections between reinforcement learning with feedback, test-time scaling, and diffusion guidance: An anthology
by: Jiao, Yuchen, et al.
Published: (2025)
Are First-Order Diffusion Samplers Really Slower? A Fast Forward-Value Approach
by: Jiao, Yuchen, et al.
Published: (2025)