| Main Authors: | Tiapkin, Daniil, Calandriello, Daniele, Belomestny, Denis, Moulines, Eric, Naumov, Alexey, Rasul, Kashif, Valko, Michal, Menard, Pierre |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.19731 |
Similar Items
Demonstration-Regularized RL
by: Tiapkin, Daniil, et al.
Published: (2023)
Model-free Posterior Sampling via Learning Rate Randomization
by: Tiapkin, Daniil, et al.
Published: (2023)
A New Bound on the Cumulant Generating Function of Dirichlet Processes
by: Perrault, Pierre, et al.
Published: (2024)
Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability
by: Samsonov, Sergey, et al.
Published: (2023)
Rates of convergence for density estimation with generative adversarial networks
by: Puchkin, Nikita, et al.
Published: (2021)
Statistical analysis of Inverse Entropy-regularized Reinforcement Learning
by: Belomestny, Denis, et al.
Published: (2025)
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation
by: Sheshukova, Marina, et al.
Published: (2024)
Large-scale semi-supervised learning with online spectral graph sparsification
by: Calandriello, Daniele, et al.
Published: (2026)
Analysis of Nystrom method with sequential ridge leverage scores
by: Calandriello, Daniele, et al.
Published: (2026)
Pack only the essentials: Adaptive dictionary learning for kernel ridge regression
by: Calandriello, Daniele, et al.
Published: (2026)
Generative Flow Networks as Entropy-Regularized RL
by: Tiapkin, Daniil, et al.
Published: (2023)
Refined Analysis of Entropy-Regularized Actor-Critic
by: Labbi, Safwan, et al.
Published: (2026)
On Global Convergence Rates for Federated Softmax Policy Gradient under Heterogeneous Environments
by: Labbi, Safwan, et al.
Published: (2025)
Beyond Softmax and Entropy: Convergence Rates of Policy Gradients with f-SoftArgmax Parameterization & Coupled Regularization
by: Labbi, Safwan, et al.
Published: (2026)
UVIP: Model-Free Approach to Evaluate Reinforcement Learning Algorithms
by: Belomestny, Denis, et al.
Published: (2021)
Tight Bounds for Schrödinger Potential Estimation in Unpaired Data Translation
by: Puchkin, Nikita, et al.
Published: (2025)
Improved large-scale graph learning through ridge spectral sparsification
by: Calandriello, Daniele, et al.
Published: (2026)
Nash Learning from Human Feedback
by: Munos, Rémi, et al.
Published: (2023)
Schrödinger bridge problem via empirical risk minimization
by: Belomestny, Denis, et al.
Published: (2026)
Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
by: Labbi, Safwan, et al.
Published: (2024)
Improving GFlowNets with Monte Carlo Tree Search
by: Morozov, Nikita, et al.
Published: (2024)
Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games
by: Ocello, Antonio, et al.
Published: (2025)
The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
by: Fiegel, Côme, et al.
Published: (2026)
Optimal Design for Reward Modeling in RLHF
by: Scheid, Antoine, et al.
Published: (2024)
Theoretical guarantees for neural control variates in MCMC
by: Belomestny, Denis, et al.
Published: (2023)
Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent
by: Sheshukova, Marina, et al.
Published: (2025)
On Teacher Hacking in Language Model Distillation
by: Tiapkin, Daniil, et al.
Published: (2025)
A note on concentration inequalities for the overlapped batch mean variance estimators for Markov chains
by: Moulines, Eric, et al.
Published: (2025)
Sample complexity of Schrödinger potential estimation
by: Puchkin, Nikita, et al.
Published: (2025)
A single algorithm for both restless and rested rotting bandits
by: Seznec, Julien, et al.
Published: (2026)
On Gaussian approximation for entropy-regularized Q-learning with function approximation
by: Rubtsov, Artemy, et al.
Published: (2026)
Adaptive Destruction Processes for Diffusion Samplers
by: Gritsaev, Timofei, et al.
Published: (2025)
Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier
by: Fiegel, Côme, et al.
Published: (2026)
Statistical inference for Linear Stochastic Approximation with Markovian Noise
by: Samsonov, Sergey, et al.
Published: (2025)
First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
by: Beznosikov, Aleksandr, et al.
Published: (2023)
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
by: Tiapkin, Daniil, et al.
Published: (2024)
Incentivized Learning in Principal-Agent Bandit Games
by: Scheid, Antoine, et al.
Published: (2024)
SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
by: Mangold, Paul, et al.
Published: (2024)
Planning in entropy-regularized Markov decision processes and games
by: Grill, Jean-Bastien, et al.
Published: (2026)
Rosenthal-type inequalities for linear statistics of Markov chains
by: Durmus, Alain, et al.
Published: (2023)