| Main Authors: | Weissmann, Simon; Freihaut, Till; Vernade, Claire; Ramponi, Giorgia; Döring, Leif |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.02735 |
Similar Items
On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning
by: Freihaut, Till, et al.
Published: (2024)
Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
by: Freihaut, Till, et al.
Published: (2025)
Multi-agent imitation learning with function approximation: Linear Markov games and beyond
by: Viano, Luca, et al.
Published: (2026)
Rate optimal learning of equilibria from data
by: Freihaut, Till, et al.
Published: (2025)
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
by: Klein, Sara, et al.
Published: (2023)
Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
by: Kassing, Sebastian, et al.
Published: (2025)
Almost sure convergence rates of stochastic gradient methods under gradient domination
by: Weissmann, Simon, et al.
Published: (2024)
The Role of Target Update Frequencies in Q-Learning
by: Weissmann, Simon, et al.
Published: (2026)
Structure Matters: Dynamic Policy Gradient
by: Klein, Sara, et al.
Published: (2024)
Tight Sample Complexity Bounds for Entropic Best Policy Identification
by: Essakine, Amer, et al.
Published: (2026)
Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
by: Macuglia, Maël, et al.
Published: (2025)
Partially Observable Reinforcement Learning with Memory Traces
by: Eberhard, Onno, et al.
Published: (2025)
Non-Stationary Lipschitz Bandits
by: Nguyen, Nicolas, et al.
Published: (2025)
Commit to the Bit: Reactive Reinforcement Learning Done Right
by: Eberhard, Onno, et al.
Published: (2026)
Variational Bayes Portfolio Construction
by: Nguyen, Nicolas, et al.
Published: (2024)
Random Function Descent
by: Benning, Felix, et al.
Published: (2023)
A Pontryagin Perspective on Reinforcement Learning
by: Eberhard, Onno, et al.
Published: (2024)
Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning
by: Balef, Amir Rezaei, et al.
Published: (2025)
Preference Elicitation for Offline Reinforcement Learning
by: Pace, Alizée, et al.
Published: (2024)
Split the Differences, Pool the Rest: Provably Efficient Multi-Objective Imitation
by: Sheebaelhamd, Ziyad, et al.
Published: (2026)
Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits
by: Nguyen, Nicolas, et al.
Published: (2024)
Online Decision Deferral under Budget Constraints
by: Reid, Mirabel, et al.
Published: (2024)
Quantization-Free Autoregressive Action Transformer
by: Sheebaelhamd, Ziyad, et al.
Published: (2025)
Truly No-Regret Learning in Constrained MDPs
by: Müller, Adrian, et al.
Published: (2024)
Gradient Span Algorithms Make Predictable Progress in High Dimension
by: Benning, Felix, et al.
Published: (2024)
Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning
by: Mutti, Mirco, et al.
Published: (2023)
On the Convergence of Single-Timescale Actor-Critic
by: Kumar, Navdeep, et al.
Published: (2024)
Efficient Risk-sensitive Planning via Entropic Risk Measures
by: Marthe, Alexandre, et al.
Published: (2025)
Contextual Bilevel Reinforcement Learning for Incentive Alignment
by: Thoma, Vinzenz, et al.
Published: (2024)
Learning Acrobatic Flight from Preferences
by: Merk, Colin, et al.
Published: (2025)
An Approximate Ascent Approach To Prove Convergence of PPO
by: Doering, Leif, et al.
Published: (2026)
Policy Gradient with Tree Search: Avoiding Local Optimas through Lookahead
by: Koren, Uri, et al.
Published: (2025)
ADDQ: Adaptive Distributional Double Q-Learning
by: Döring, Leif, et al.
Published: (2025)
Optimal Sample Complexity for Single Time-Scale Actor-Critic with Momentum
by: Kumar, Navdeep, et al.
Published: (2026)
An in depth look at the Procrustes-Wasserstein distance: properties and barycenters
by: Adamo, Davide, et al.
Published: (2025)
Dual Formulation for Non-Rectangular Lp Robust Markov Decision Processes
by: Kumar, Navdeep, et al.
Published: (2025)
MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference
by: Baur, Raphaël, et al.
Published: (2026)
Aligning Language Models from User Interactions
by: Buening, Thomas Kleine, et al.
Published: (2026)
Adaptive Kernel Selection for Stein Variational Gradient Descent
by: Melcher, Moritz, et al.
Published: (2025)
Beyond R-barycenters: an effective averaging method on Stiefel and Grassmann manifolds
by: Bouchard, Florent, et al.
Published: (2025)