:: Library Catalog

Bibliographic Details
Main Author:	Qiao, Chuhan
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2604.17910

Similar Items

ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
by: Agnihotri, Akhil, et al.
Published: (2023)

CausalFlow: Causal Attribution and Counterfactual Repair for LLM Agent Failures
by: Bonagiri, Akash, et al.
Published: (2026)

SCOPE: Sequential Causal Optimization of Process Interventions
by: De Moor, Jakob, et al.
Published: (2025)

No-Regret Reinforcement Learning in Smooth MDPs
by: Maran, Davide, et al.
Published: (2024)

Differentiable Constraint-Based Causal Discovery
by: Zhou, Jincheng, et al.
Published: (2025)

Linear Causal Discovery with Interventional Constraints
by: Guo, Zhigao, et al.
Published: (2025)

Low-Rank MDPs with Continuous Action Spaces
by: Bennett, Andrew, et al.
Published: (2023)

Efficient Solution and Learning of Robust Factored MDPs
by: Schnitzer, Yannik, et al.
Published: (2025)

Missingness-MDPs: Bridging the Theory of Missing Data and POMDPs
by: Wendland, Joshua, et al.
Published: (2026)

Geometry of Drifting MDPs with Path-Integral Stability Certificates
by: Zhang, Zuyuan, et al.
Published: (2026)

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
by: Shakerinava, Mehran, et al.
Published: (2025)

SnareNet: Flexible Repair Layers for Neural Networks with Hard Constraints
by: Chu, Ya-Chi, et al.
Published: (2026)

Robust Causal Discovery under Imperfect Structural Constraints
by: Wang, Zidong, et al.
Published: (2025)

On-line Learning in Tree MDPs by Treating Policies as Bandit Arms
by: Shah, Anvay, et al.
Published: (2026)

Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks
by: Ge, Luise, et al.
Published: (2025)

Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
by: Mondal, Washim Uddin, et al.
Published: (2024)

Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
by: Zimmer, Matthieu, et al.
Published: (2025)

Exploration Implies Data Augmentation: Reachability and Generalisation in Contextual MDPs
by: Weltevrede, Max, et al.
Published: (2024)

Solving Multi-Model MDPs by Coordinate Ascent and Dynamic Programming
by: Su, Xihong, et al.
Published: (2024)

Risk-averse Total-reward MDPs with ERM and EVaR
by: Su, Xihong, et al.
Published: (2024)

On the Convergence of Monte Carlo UCB for Random-Length Episodic MDPs
by: Dong, Zixuan, et al.
Published: (2022)

Goal-Oriented Sequential Bayesian Experimental Design for Causal Learning
by: Zhang, Zheyu, et al.
Published: (2025)

A Survey of Pipeline Tools for Data Engineering
by: Mbata, Anthony, et al.
Published: (2024)

Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
by: Zuo, Qian, et al.
Published: (2025)

Bad Values but Good Behavior: Learning Highly Misspecified Bandits and MDPs
by: Banerjee, Debangshu, et al.
Published: (2023)

Physics-Informed Machine Learning in Biomedical Science and Engineering
by: Ahmadi, Nazanin, et al.
Published: (2025)

SNAP: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation With an Unknown Graph
by: Schubert, Mátyás, et al.
Published: (2025)

Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis
by: Qiao, Jie, et al.
Published: (2024)

Social Physics Informed Diffusion Model for Crowd Simulation
by: Chen, Hongyi, et al.
Published: (2024)

GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
by: Abavisani, Mohammadsajad, et al.
Published: (2022)

A Computationally Efficient Algorithm for Infinite-Horizon Average-Reward Linear MDPs
by: Hong, Kihyuk, et al.
Published: (2025)

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs
by: Shrestha, Aayam, et al.
Published: (2020)

Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs
by: Maran, Davide, et al.
Published: (2024)

RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk
by: Hau, Jia Lin, et al.
Published: (2022)

Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions
by: Mhammedi, Zakaria
Published: (2024)

CauScale: Neural Causal Discovery at Scale
by: Peng, Bo, et al.
Published: (2026)

Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy
by: Cai, Ruichu, et al.
Published: (2024)

Second-Order Actor-Critic Methods for Discounted MDPs via Policy Hessian Decomposition
by: Manivannan, Sanjeev, et al.
Published: (2026)

Reward Redistribution for CVaR MDPs using a Bellman Operator on L-infinity
by: Muni, Aneri, et al.
Published: (2026)

Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm
by: Xu, Yang, et al.
Published: (2025)