:: Library Catalog

Bibliographic Details
Main Authors:	Weissmann, Simon; Freihaut, Till; Vernade, Claire; Ramponi, Giorgia; Döring, Leif
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2503.02735

Similar Items

On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning
by: Freihaut, Till, et al.
Published: (2024)

Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
by: Freihaut, Till, et al.
Published: (2025)

Multi-agent imitation learning with function approximation: Linear Markov games and beyond
by: Viano, Luca, et al.
Published: (2026)

Rate optimal learning of equilibria from data
by: Freihaut, Till, et al.
Published: (2025)

Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
by: Klein, Sara, et al.
Published: (2023)

Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
by: Kassing, Sebastian, et al.
Published: (2025)

Almost sure convergence rates of stochastic gradient methods under gradient domination
by: Weissmann, Simon, et al.
Published: (2024)

The Role of Target Update Frequencies in Q-Learning
by: Weissmann, Simon, et al.
Published: (2026)

Structure Matters: Dynamic Policy Gradient
by: Klein, Sara, et al.
Published: (2024)

Tight Sample Complexity Bounds for Entropic Best Policy Identification
by: Essakine, Amer, et al.
Published: (2026)

Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
by: Macuglia, Maël, et al.
Published: (2025)

Partially Observable Reinforcement Learning with Memory Traces
by: Eberhard, Onno, et al.
Published: (2025)

Non-Stationary Lipschitz Bandits
by: Nguyen, Nicolas, et al.
Published: (2025)

Commit to the Bit: Reactive Reinforcement Learning Done Right
by: Eberhard, Onno, et al.
Published: (2026)

Variational Bayes Portfolio Construction
by: Nguyen, Nicolas, et al.
Published: (2024)

Random Function Descent
by: Benning, Felix, et al.
Published: (2023)

A Pontryagin Perspective on Reinforcement Learning
by: Eberhard, Onno, et al.
Published: (2024)

Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning
by: Balef, Amir Rezaei, et al.
Published: (2025)

Preference Elicitation for Offline Reinforcement Learning
by: Pace, Alizée, et al.
Published: (2024)

Split the Differences, Pool the Rest: Provably Efficient Multi-Objective Imitation
by: Sheebaelhamd, Ziyad, et al.
Published: (2026)

Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits
by: Nguyen, Nicolas, et al.
Published: (2024)

Online Decision Deferral under Budget Constraints
by: Reid, Mirabel, et al.
Published: (2024)

Quantization-Free Autoregressive Action Transformer
by: Sheebaelhamd, Ziyad, et al.
Published: (2025)

Truly No-Regret Learning in Constrained MDPs
by: Müller, Adrian, et al.
Published: (2024)

Gradient Span Algorithms Make Predictable Progress in High Dimension
by: Benning, Felix, et al.
Published: (2024)

Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning
by: Mutti, Mirco, et al.
Published: (2023)

On the Convergence of Single-Timescale Actor-Critic
by: Kumar, Navdeep, et al.
Published: (2024)

Efficient Risk-sensitive Planning via Entropic Risk Measures
by: Marthe, Alexandre, et al.
Published: (2025)

Contextual Bilevel Reinforcement Learning for Incentive Alignment
by: Thoma, Vinzenz, et al.
Published: (2024)

Learning Acrobatic Flight from Preferences
by: Merk, Colin, et al.
Published: (2025)

An Approximate Ascent Approach To Prove Convergence of PPO
by: Doering, Leif, et al.
Published: (2026)

Policy Gradient with Tree Search: Avoiding Local Optimas through Lookahead
by: Koren, Uri, et al.
Published: (2025)

ADDQ: Adaptive Distributional Double Q-Learning
by: Döring, Leif, et al.
Published: (2025)

Optimal Sample Complexity for Single Time-Scale Actor-Critic with Momentum
by: Kumar, Navdeep, et al.
Published: (2026)

An in depth look at the Procrustes-Wasserstein distance: properties and barycenters
by: Adamo, Davide, et al.
Published: (2025)

Dual Formulation for Non-Rectangular Lp Robust Markov Decision Processes
by: Kumar, Navdeep, et al.
Published: (2025)

MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference
by: Baur, Raphaël, et al.
Published: (2026)

Aligning Language Models from User Interactions
by: Buening, Thomas Kleine, et al.
Published: (2026)

Adaptive Kernel Selection for Stein Variational Gradient Descent
by: Melcher, Moritz, et al.
Published: (2025)

Beyond R-barycenters: an effective averaging method on Stiefel and Grassmann manifolds
by: Bouchard, Florent, et al.
Published: (2025)