:: Library Catalog

Bibliographic Details
Main Authors:	Tang, Wenpin; Zhou, Xun Yu
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Optimization and Control; Probability
Online Access:	https://arxiv.org/abs/2411.01302

Similar Items

Fine-tuning of diffusion models via stochastic control: entropy regularization and beyond
by: Tang, Wenpin, et al.
Published: (2024)

Improved sampling via learned diffusions
by: Richter, Lorenz, et al.
Published: (2023)

Continuous-time q-learning for mean-field control with common noise, part-II: q-learning algorithms
by: Ren, Zhenjie, et al.
Published: (2026)

Representative Action Selection for Large Action Space: From Bandits to MDPs
by: Zhou, Quan, et al.
Published: (2025)

On Bellman equations for continuous-time policy evaluation I: discretization and approximation
by: Mou, Wenlong, et al.
Published: (2024)

Representative Action Selection for Large Action Space Bandit Families
by: Zhou, Quan, et al.
Published: (2025)

Statistical guarantees for continuous-time policy evaluation: blessing of ellipticity and new tradeoffs
by: Mou, Wenlong
Published: (2025)

Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
by: Yang, Zixian, et al.
Published: (2025)

ART for Diffusion Sampling: A Reinforcement Learning Approach to Timestep Schedule
by: Huang, Yilie, et al.
Published: (2026)

Large Deviation Upper Bounds and Improved MSE Rates of Nonlinear SGD: Heavy-tailed Noise and Power of Symmetry
by: Armacki, Aleksandar, et al.
Published: (2024)

Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty
by: Neufeld, Ariel, et al.
Published: (2022)

Continuous-time q-learning for mean-field control with common noise, part-I: Theoretical foundations
by: Ren, Zhenjie, et al.
Published: (2026)

Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems
by: Huang, Yilie, et al.
Published: (2024)

Is RL fine-tuning harder than regression? A PDE learning approach for diffusion models
by: Mou, Wenlong
Published: (2025)

A Generalization Result for Convergence in Learning-to-Optimize
by: Sucker, Michael, et al.
Published: (2024)

Decoupled Functional Central Limit Theorems for Two-Time-Scale Stochastic Approximation
by: Han, Yuze, et al.
Published: (2024)

Asymptotic regularity of a generalised stochastic Halpern scheme
by: Pischke, Nicholas, et al.
Published: (2024)

Accelerating Distributed Stochastic Optimization via Self-Repellent Random Walks
by: Hu, Jie, et al.
Published: (2024)

Stochastic Inverse Problem: stability, regularization and Wasserstein gradient flow
by: Li, Qin, et al.
Published: (2024)

Weak Convergence Analysis of Online Neural Actor-Critic Algorithms
by: Lam, Samuel Chun-Hei, et al.
Published: (2024)

Learning-Based Pricing and Matching for Two-Sided Queues
by: Yang, Zixian, et al.
Published: (2024)

Data-driven Multistage Distributionally Robust Linear Optimization with Nested Distance
by: Gao, Rui, et al.
Published: (2024)

On the SAGA algorithm with decreasing step
by: Fredes, Luis, et al.
Published: (2024)

The generator gradient estimator is an adjoint state method for stochastic differential equations
by: Badolle, Quentin, et al.
Published: (2024)

Convergence rates for the Adam optimizer
by: Dereich, Steffen, et al.
Published: (2024)

Linear convergence of proximal descent schemes on the Wasserstein space
by: Lascu, Razvan-Andrei, et al.
Published: (2024)

Prelimit Coupling and Steady-State Convergence of Constant-stepsize Nonsmooth Contractive SA
by: Zhang, Yixuan, et al.
Published: (2024)

Model Predictive Control is Almost Optimal for Restless Bandit
by: Gast, Nicolas, et al.
Published: (2024)

Mirror Descent-Ascent for mean-field min-max problems
by: Lascu, Razvan-Andrei, et al.
Published: (2024)

A Fisher-Rao gradient flow for entropic mean-field min-max games
by: Lascu, Razvan-Andrei, et al.
Published: (2024)

Structure Matters: Dynamic Policy Gradient
by: Klein, Sara, et al.
Published: (2024)

Model Predictive Control is almost Optimal for Heterogeneous Restless Multi-armed Bandits
by: Narasimha, Dheeraj, et al.
Published: (2025)

Properties of Discrete Sliced Wasserstein Losses
by: Tanguy, Eloi, et al.
Published: (2023)

Non-convex entropic mean-field optimization via Best Response flow
by: Lascu, Razvan-Andrei, et al.
Published: (2025)

On propagation of chaos for the Fisher-Rao gradient flow in entropic mean-field optimization
by: Lazić, Petra, et al.
Published: (2026)

Value Mirror Descent for Reinforcement Learning
by: Jia, Zhichao, et al.
Published: (2026)

Function approximation by neural nets in the mean-field regime: Entropic regularization and controlled McKean-Vlasov dynamics
by: Tzen, Belinda, et al.
Published: (2020)

ODE approximation for the Adam algorithm: General and overparametrized setting
by: Dereich, Steffen, et al.
Published: (2025)

Generalized Wasserstein Flow Matching: Transport Plans, Everywhere, All at Once
by: Piening, Moritz, et al.
Published: (2026)

Concentration of General Stochastic Approximation Under Heavy-Tailed Markovian Noise
by: Agrawal, Shubhada, et al.
Published: (2026)