:: Library Catalog

Bibliographic Details
Main Authors:	Tiapkin, Daniil; Calandriello, Daniele; Belomestny, Denis; Moulines, Eric; Naumov, Alexey; Rasul, Kashif; Valko, Michal; Menard, Pierre
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.19731

Similar Items

Demonstration-Regularized RL
by: Tiapkin, Daniil, et al.
Published: (2023)

Model-free Posterior Sampling via Learning Rate Randomization
by: Tiapkin, Daniil, et al.
Published: (2023)

A New Bound on the Cumulant Generating Function of Dirichlet Processes
by: Perrault, Pierre, et al.
Published: (2024)

Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability
by: Samsonov, Sergey, et al.
Published: (2023)

Rates of convergence for density estimation with generative adversarial networks
by: Puchkin, Nikita, et al.
Published: (2021)

Statistical analysis of Inverse Entropy-regularized Reinforcement Learning
by: Belomestny, Denis, et al.
Published: (2025)

Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation
by: Sheshukova, Marina, et al.
Published: (2024)

Large-scale semi-supervised learning with online spectral graph sparsification
by: Calandriello, Daniele, et al.
Published: (2026)

Analysis of Nystrom method with sequential ridge leverage scores
by: Calandriello, Daniele, et al.
Published: (2026)

Pack only the essentials: Adaptive dictionary learning for kernel ridge regression
by: Calandriello, Daniele, et al.
Published: (2026)

Generative Flow Networks as Entropy-Regularized RL
by: Tiapkin, Daniil, et al.
Published: (2023)

Refined Analysis of Entropy-Regularized Actor-Critic
by: Labbi, Safwan, et al.
Published: (2026)

On Global Convergence Rates for Federated Softmax Policy Gradient under Heterogeneous Environments
by: Labbi, Safwan, et al.
Published: (2025)

Beyond Softmax and Entropy: Convergence Rates of Policy Gradients with f-SoftArgmax Parameterization & Coupled Regularization
by: Labbi, Safwan, et al.
Published: (2026)

UVIP: Model-Free Approach to Evaluate Reinforcement Learning Algorithms
by: Belomestny, Denis, et al.
Published: (2021)

Tight Bounds for Schrödinger Potential Estimation in Unpaired Data Translation
by: Puchkin, Nikita, et al.
Published: (2025)

Improved large-scale graph learning through ridge spectral sparsification
by: Calandriello, Daniele, et al.
Published: (2026)

Nash Learning from Human Feedback
by: Munos, Rémi, et al.
Published: (2023)

Schrödinger bridge problem via empirical risk minimization
by: Belomestny, Denis, et al.
Published: (2026)

Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
by: Labbi, Safwan, et al.
Published: (2024)

Improving GFlowNets with Monte Carlo Tree Search
by: Morozov, Nikita, et al.
Published: (2024)

Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games
by: Ocello, Antonio, et al.
Published: (2025)

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
by: Fiegel, Côme, et al.
Published: (2026)

Optimal Design for Reward Modeling in RLHF
by: Scheid, Antoine, et al.
Published: (2024)

Theoretical guarantees for neural control variates in MCMC
by: Belomestny, Denis, et al.
Published: (2023)

Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent
by: Sheshukova, Marina, et al.
Published: (2025)

On Teacher Hacking in Language Model Distillation
by: Tiapkin, Daniil, et al.
Published: (2025)

A note on concentration inequalities for the overlapped batch mean variance estimators for Markov chains
by: Moulines, Eric, et al.
Published: (2025)

Sample complexity of Schrödinger potential estimation
by: Puchkin, Nikita, et al.
Published: (2025)

A single algorithm for both restless and rested rotting bandits
by: Seznec, Julien, et al.
Published: (2026)

On Gaussian approximation for entropy-regularized Q-learning with function approximation
by: Rubtsov, Artemy, et al.
Published: (2026)

Adaptive Destruction Processes for Diffusion Samplers
by: Gritsaev, Timofei, et al.
Published: (2025)

Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier
by: Fiegel, Côme, et al.
Published: (2026)

Statistical inference for Linear Stochastic Approximation with Markovian Noise
by: Samsonov, Sergey, et al.
Published: (2025)

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
by: Beznosikov, Aleksandr, et al.
Published: (2023)

Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
by: Tiapkin, Daniil, et al.
Published: (2024)

Incentivized Learning in Principal-Agent Bandit Games
by: Scheid, Antoine, et al.
Published: (2024)

SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
by: Mangold, Paul, et al.
Published: (2024)

Planning in entropy-regularized Markov decision processes and games
by: Grill, Jean-Bastien, et al.
Published: (2026)

Rosenthal-type inequalities for linear statistics of Markov chains
by: Durmus, Alain, et al.
Published: (2023)