| Main Authors: | Wang, Zijian; Wang, Bin; Shao, Mingwen; Dou, Hongbo; Tao, Boxiang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.02774 |
Similar Items
Controllable Flow Matching for Online Reinforcement Learning
by: Wang, Bin, et al.
Published: (2025)
Walk Wisely on Graph: Knowledge Graph Reasoning with Dual Agents via Efficient Guidance-Exploration
by: Wang, Zijian, et al.
Published: (2024)
Model-Based Exploration in Monitored Markov Decision Processes
by: Kazemipour, Alireza, et al.
Published: (2025)
Logarithmic Regret of Exploration in Average Reward Markov Decision Processes
by: Boone, Victor, et al.
Published: (2025)
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
by: De Santi, Riccardo, et al.
Published: (2024)
Optimistic Regret Bounds for Online Learning in Adversarial Markov Decision Processes
by: Moon, Sang Bin, et al.
Published: (2024)
A Theoretical Analysis of State Similarity Between Markov Decision Processes
by: Tao, Zhenyu, et al.
Published: (2025)
Policy Testing in Markov Decision Processes
by: Ariu, Kaito, et al.
Published: (2025)
Learning in Markov Decision Processes with Exogenous Dynamics
by: Maran, Davide, et al.
Published: (2026)
1-2-3-Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization
by: Azeem, Muqsit, et al.
Published: (2024)
Monitored Markov Decision Processes
by: Parisi, Simone, et al.
Published: (2024)
Learning Utilities from Demonstrations in Markov Decision Processes
by: Lazzati, Filippo, et al.
Published: (2024)
Policy Gradient for Robust Markov Decision Processes
by: Wang, Qiuhao, et al.
Published: (2024)
Generalized Linear Markov Decision Process
by: Zhang, Sinian, et al.
Published: (2025)
Federated Control in Markov Decision Processes
by: Jin, Hao, et al.
Published: (2024)
Transition Transfer $Q$-Learning for Composite Markov Decision Processes
by: Chai, Jinhang, et al.
Published: (2025)
Learning Markov Decision Processes under Fully Bandit Feedback
by: Zhuo, Zhengjia, et al.
Published: (2026)
OCMDP: Observation-Constrained Markov Decision Process
by: Wang, Taiyi, et al.
Published: (2024)
Model-based Reinforcement Learning for Parameterized Action Spaces
by: Zhang, Renhao, et al.
Published: (2024)
Optimal Sample Complexity for Average Reward Markov Decision Processes
by: Wang, Shengbo, et al.
Published: (2023)
Optimal Decision Tree Policies for Markov Decision Processes
by: Vos, Daniël, et al.
Published: (2023)
Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes
by: Montenegro, Alessandro, et al.
Published: (2025)
Interaction-Grounded Learning for Contextual Markov Decision Processes with Personalized Feedback
by: Zhang, Mengxiao, et al.
Published: (2026)
Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning
by: Krau, Tatjana, et al.
Published: (2026)
Learning Constrained Markov Decision Processes With Non-stationary Rewards and Constraints
by: Stradi, Francesco Emanuele, et al.
Published: (2024)
Almost Sure Convergence of Differential Temporal Difference Learning for Average Reward Markov Decision Processes
by: Blaser, Ethan, et al.
Published: (2026)
Performative Reinforcement Learning with Linear Markov Decision Process
by: Mandal, Debmalya, et al.
Published: (2024)
Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes
by: Wang, He, et al.
Published: (2024)
Increasing Information for Model Predictive Control with Semi-Markov Decision Processes
by: Hosseinkhan-Boucher, Rémy, et al.
Published: (2025)
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
by: Mondal, Washim Uddin, et al.
Published: (2023)
Markov Decision Processes under External Temporal Processes
by: Ayyagari, Ranga Shaarad, et al.
Published: (2023)
Act as You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes
by: Luo, Baiting, et al.
Published: (2024)
A Markov Decision Process for Variable Selection in Branch & Bound
by: Strang, Paul, et al.
Published: (2025)
Dynamic Deep-Reinforcement-Learning Algorithm in Partially Observable Markov Decision Processes
by: Omi, Saki, et al.
Published: (2023)
The regret lower bound for communicating Markov Decision Processes
by: Boone, Victor, et al.
Published: (2025)
An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes
by: Javurek, Emil, et al.
Published: (2025)
Initial Distribution Sensitivity of Constrained Markov Decision Processes
by: Tercan, Alperen, et al.
Published: (2025)
Improving Controller Generalization with Dimensionless Markov Decision Processes
by: Charvet, Valentin, et al.
Published: (2025)
Horizon-Free Regret for Linear Markov Decision Processes
by: Zhang, Zihan, et al.
Published: (2024)
Achieving Constant Regret in Linear Markov Decision Processes
by: Zhang, Weitong, et al.
Published: (2024)