| Main Author: | Qiao, Chuhan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.17910 |
Similar Items
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
by: Agnihotri, Akhil, et al.
Published: (2023)
CausalFlow: Causal Attribution and Counterfactual Repair for LLM Agent Failures
by: Bonagiri, Akash, et al.
Published: (2026)
SCOPE: Sequential Causal Optimization of Process Interventions
by: De Moor, Jakob, et al.
Published: (2025)
No-Regret Reinforcement Learning in Smooth MDPs
by: Maran, Davide, et al.
Published: (2024)
Differentiable Constraint-Based Causal Discovery
by: Zhou, Jincheng, et al.
Published: (2025)
Linear Causal Discovery with Interventional Constraints
by: Guo, Zhigao, et al.
Published: (2025)
Low-Rank MDPs with Continuous Action Spaces
by: Bennett, Andrew, et al.
Published: (2023)
Efficient Solution and Learning of Robust Factored MDPs
by: Schnitzer, Yannik, et al.
Published: (2025)
Missingness-MDPs: Bridging the Theory of Missing Data and POMDPs
by: Wendland, Joshua, et al.
Published: (2026)
Geometry of Drifting MDPs with Path-Integral Stability Certificates
by: Zhang, Zuyuan, et al.
Published: (2026)
Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
by: Shakerinava, Mehran, et al.
Published: (2025)
SnareNet: Flexible Repair Layers for Neural Networks with Hard Constraints
by: Chu, Ya-Chi, et al.
Published: (2026)
Robust Causal Discovery under Imperfect Structural Constraints
by: Wang, Zidong, et al.
Published: (2025)
On-line Learning in Tree MDPs by Treating Policies as Bandit Arms
by: Shah, Anvay, et al.
Published: (2026)
Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks
by: Ge, Luise, et al.
Published: (2025)
Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
by: Mondal, Washim Uddin, et al.
Published: (2024)
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
by: Zimmer, Matthieu, et al.
Published: (2025)
Exploration Implies Data Augmentation: Reachability and Generalisation in Contextual MDPs
by: Weltevrede, Max, et al.
Published: (2024)
Solving Multi-Model MDPs by Coordinate Ascent and Dynamic Programming
by: Su, Xihong, et al.
Published: (2024)
Risk-averse Total-reward MDPs with ERM and EVaR
by: Su, Xihong, et al.
Published: (2024)
On the Convergence of Monte Carlo UCB for Random-Length Episodic MDPs
by: Dong, Zixuan, et al.
Published: (2022)
Goal-Oriented Sequential Bayesian Experimental Design for Causal Learning
by: Zhang, Zheyu, et al.
Published: (2025)
A Survey of Pipeline Tools for Data Engineering
by: Mbata, Anthony, et al.
Published: (2024)
Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
by: Zuo, Qian, et al.
Published: (2025)
Bad Values but Good Behavior: Learning Highly Misspecified Bandits and MDPs
by: Banerjee, Debangshu, et al.
Published: (2023)
Physics-Informed Machine Learning in Biomedical Science and Engineering
by: Ahmadi, Nazanin, et al.
Published: (2025)
SNAP: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation With an Unknown Graph
by: Schubert, Mátyás, et al.
Published: (2025)
Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis
by: Qiao, Jie, et al.
Published: (2024)
Social Physics Informed Diffusion Model for Crowd Simulation
by: Chen, Hongyi, et al.
Published: (2024)
GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
by: Abavisani, Mohammadsajad, et al.
Published: (2022)
A Computationally Efficient Algorithm for Infinite-Horizon Average-Reward Linear MDPs
by: Hong, Kihyuk, et al.
Published: (2025)
DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs
by: Shrestha, Aayam, et al.
Published: (2020)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs
by: Maran, Davide, et al.
Published: (2024)
RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk
by: Hau, Jia Lin, et al.
Published: (2022)
Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions
by: Mhammedi, Zakaria
Published: (2024)
CauScale: Neural Causal Discovery at Scale
by: Peng, Bo, et al.
Published: (2026)
Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy
by: Cai, Ruichu, et al.
Published: (2024)
Second-Order Actor-Critic Methods for Discounted MDPs via Policy Hessian Decomposition
by: Manivannan, Sanjeev, et al.
Published: (2026)
Reward Redistribution for CVaR MDPs using a Bellman Operator on L-infinity
by: Muni, Aneri, et al.
Published: (2026)
Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm
by: Xu, Yang, et al.
Published: (2025)