Library Catalog

Bibliographic Details
Main Authors:	Kohler, Hector; Akrour, Riad; Preux, Philippe
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2309.12701

Similar Items

Limits of Actor-Critic Algorithms for Decision Tree Policies Learning in IBMDPs
by: Kohler, Hector, et al.
Published: (2023)

Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning
by: Kohler, Hector, et al.
Published: (2024)

Evaluating Interpretable Reinforcement Learning by Distilling Policies into Programs
by: Kohler, Hector, et al.
Published: (2025)

When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning
by: Berthelot, Yann, et al.
Published: (2026)

PB$^2$: Preference Space Exploration via Population-Based Methods in Preference-Based Reinforcement Learning
by: Driss, Brahim, et al.
Published: (2025)

Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop
by: Kohler, Hector, et al.
Published: (2024)

StaQ it! Growing neural networks for Policy Mirror Descent
by: Shilova, Alena, et al.
Published: (2025)

Augmented Bayesian Policy Search
by: Kallel, Mahdi, et al.
Published: (2024)

IDEQ: an improved diffusion model for the TSP
by: Basson, Mickael, et al.
Published: (2024)

AdaStop: adaptive statistical testing for sound comparisons of Deep RL agents
by: Mathieu, Timothée, et al.
Published: (2023)

Bandits attack function optimization
by: Preux, Philippe, et al.
Published: (2026)

End-to-End Efficient RL for Linear Bellman Complete MDPs with Deterministic Transitions
by: Mhammedi, Zakaria, et al.
Published: (2026)

Prospect-Theory Behavior from Bellman Optimality in MDPs with Catastrophic States
by: Chen, Yujiao
Published: (2026)

Reward Redistribution for CVaR MDPs using a Bellman Operator on L-infinity
by: Muni, Aneri, et al.
Published: (2026)

Leo Breiman, the Rashomon Effect, and the Occam Dilemma
by: Rudin, Cynthia
Published: (2025)

RFX-Fuse: Breiman and Cutler's Unified ML Engine + Native Explainable Similarity
by: Kuchar, Chris
Published: (2026)

Optimal or Greedy Decision Trees? Revisiting their Objectives, Tuning, and Performance
by: van der Linden, Jacobus G. M., et al.
Published: (2024)

Robust Multi-Agent Path Finding under Observation Attacks: A Principled Adversarial-Plus-Smoothing Training Recipe
by: Ahmed, Riad
Published: (2026)

An Improved Model-Free Decision-Estimation Coefficient with Applications in Adversarial MDPs
by: Liu, Haolin, et al.
Published: (2025)

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs
by: Zhong, Han, et al.
Published: (2021)

Revisiting Weighted Strategy for Non-stationary Parametric Bandits and MDPs
by: Wang, Jing, et al.
Published: (2026)

Predicting Multi-Drug Resistance in Bacterial Isolates Through Performance Comparison and LIME-based Interpretation of Classification Models
by: Wishal, Santanam, et al.
Published: (2026)

Bellman Diffusion Models
by: Schramm, Liam, et al.
Published: (2024)

Bellman Optimality of Average-Reward Robust Markov Decision Processes with a Constant Gain
by: Wang, Shengbo, et al.
Published: (2025)

Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization
by: Gadot, Uri, et al.
Published: (2023)

Provably Efficient Algorithms for S- and Non-Rectangular Robust MDPs with General Parameterization
by: Satheesh, Anirudh, et al.
Published: (2026)

Stability and Generalization for Bellman Residuals
by: Kang, Enoch H., et al.
Published: (2025)

Contraction-Aligned Analysis of Soft Bellman Residual Minimization with Weighted Lp-Norm for Markov Decision Problem
by: Yang, Hyukjun, et al.
Published: (2026)

Bellman Error Centering
by: Chen, Xingguo, et al.
Published: (2025)

Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints
by: Xu, Tian, et al.
Published: (2026)

Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
by: Bajpai, Divya Jyoti, et al.
Published: (2025)

LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
by: Schmied, Thomas, et al.
Published: (2025)

Beyond the Bellman Recursion: A Pontryagin-Guided Framework for Non-Exponential Discounting
by: Ko, Hojin, et al.
Published: (2026)

Accelerating Matrix Diagonalization through Decision Transformers with Epsilon-Greedy Optimization
by: Bhatta, Kshitij, et al.
Published: (2024)

Time-Constrained Robust MDPs
by: Zouitine, Adil, et al.
Published: (2024)

Parameterized Projected Bellman Operator
by: Vincent, Théo, et al.
Published: (2023)

Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
by: Omura, Motoki, et al.
Published: (2025)

Distributional Bellman Operators over Mean Embeddings
by: Wenliang, Li Kevin, et al.
Published: (2023)

ShiQ: Bringing back Bellman to LLMs
by: Clavier, Pierre, et al.
Published: (2025)

Reinforcement Learning for Infinite-Horizon Average-Reward Linear MDPs via Approximation by Discounted-Reward MDPs
by: Hong, Kihyuk, et al.
Published: (2024)