| Main Authors: | Neu, Gergely, Papini, Matteo, Schwartz, Ludovic |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.15411 |
Similar Items
Sparse Optimistic Information Directed Sampling
by: Schwartz, Ludovic, et al.
Published: (2025)
Optimistically Optimistic Exploration for Provably Efficient Infinite-Horizon Reinforcement and Imitation Learning
by: Moulin, Antoine, et al.
Published: (2025)
Distances for Markov chains from sample streams
by: Calo, Sergio, et al.
Published: (2025)
Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
by: Calo, Sergio, et al.
Published: (2024)
Offline RL via Feature-Occupancy Gradient Ascent
by: Neu, Gergely, et al.
Published: (2024)
Online combinatorial optimization with stochastic decision sets and adversarial losses
by: Neu, Gergely, et al.
Published: (2026)
Online-to-PAC Conversions: Generalization Bounds via Regret Analysis
by: Lugosi, Gábor, et al.
Published: (2023)
Dealing with unbounded gradients in stochastic saddle-point optimization
by: Neu, Gergely, et al.
Published: (2024)
Policy Gradient with Active Importance Sampling
by: Papini, Matteo, et al.
Published: (2024)
Online-to-PAC generalization bounds under graph-mixing dependencies
by: Abélès, Baptiste, et al.
Published: (2024)
Generalization bounds for mixing processes via delayed online-to-PAC conversions
by: Abeles, Baptiste, et al.
Published: (2024)
Inverse Q-Learning Done Right: Offline Imitation Learning in $Q^π$-Realizable MDPs
by: Moulin, Antoine, et al.
Published: (2025)
Online learning with Erdős-Rényi side-observation graphs
by: Kocák, Tomáš, et al.
Published: (2026)
Online learning with noisy side observations
by: Kocák, Tomáš, et al.
Published: (2026)
Actor-Critic with Active Importance Sampling
by: Molaei, Majid, et al.
Published: (2026)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs
by: Maran, Davide, et al.
Published: (2024)
Linear Bandits with Non-i.i.d. Noise
by: Abélès, Baptiste, et al.
Published: (2025)
Efficient learning by implicit exploration in bandit problems with side observations
by: Kocak, Tomas, et al.
Published: (2026)
Confidence Sequences for Generalized Linear Models via Regret Analysis
by: Clerico, Eugenio, et al.
Published: (2025)
How Log-Barrier Helps Exploration in Policy Optimization
by: Cesani, Leonardo, et al.
Published: (2026)
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
by: Montenegro, Alessandro, et al.
Published: (2024)
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs
by: Maran, Davide, et al.
Published: (2024)
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
by: Montenegro, Alessandro, et al.
Published: (2024)
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
by: Bayrooti, Jasmine, et al.
Published: (2024)
Impact of Connectivity on Laplacian Representations in Reinforcement Learning
by: Giorgi, Tommaso, et al.
Published: (2026)
Optimistic Thompson Sampling for No-Regret Learning in Unknown Games
by: Li, Yingru, et al.
Published: (2024)
Statistical Analysis of Policy Space Compression Problem
by: Molaei, Majid, et al.
Published: (2024)
No-Regret Reinforcement Learning in Smooth MDPs
by: Maran, Davide, et al.
Published: (2024)
Optimistic Policy Regularization
by: Pham, Mai, et al.
Published: (2026)
Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning (Extended Version)
by: Olivieri, Pierriccardo, et al.
Published: (2026)
Reusing Trajectories in Policy Gradients Enables Fast Convergence
by: Montenegro, Alessandro, et al.
Published: (2025)
Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes
by: Montenegro, Alessandro, et al.
Published: (2025)
Omega: Optimistic EMA Gradients
by: Ramirez, Juan, et al.
Published: (2023)
Optimistic Learning for Communication Networks
by: Iosifidis, George, et al.
Published: (2025)
Bayesian Optimistic Optimisation with Exponentially Decaying Regret
by: Tran-The, Hung, et al.
Published: (2021)
Optimistic Dual Averaging Unifies Modern Optimizers
by: Pethick, Thomas, et al.
Published: (2026)
Optimistic critics can empower small actors
by: Mastikhina, Olya, et al.
Published: (2025)
SOMBRL: Scalable and Optimistic Model-Based RL
by: Sukhija, Bhavya, et al.
Published: (2025)
Optimistic Task Inference for Behavior Foundation Models
by: Rupf, Thomas, et al.
Published: (2025)
Optimistic Reinforcement Learning with Quantile Objectives
by: Alipour-Vaezi, Mohammad, et al.
Published: (2025)