| Main Authors: | Chen, Zequn, Marrero, Wesley J. |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.04334 |
Similar Items
KFCPO: Kronecker-Factored Approximated Constrained Policy Optimization
by: Lim, Joonyoung, et al.
Published: (2025)
Thompson Sampling for Infinite-Horizon Discounted Decision Processes
by: Adelman, Daniel, et al.
Published: (2024)
A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics
by: Ye, Qihao, et al.
Published: (2025)
Communicating Plans, Not Percepts: Scalable Multi-Agent Coordination with Embodied World Models
by: Hill, Brennen A., et al.
Published: (2025)
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
by: Farooq, Ahmad, et al.
Published: (2025)
Monotone and Conservative Policy Iteration Beyond the Tabular Case
by: Eshwar, S. R., et al.
Published: (2025)
Adaptive Reward Design for Reinforcement Learning
by: Kwon, Minjae, et al.
Published: (2024)
Autonomous AI Agents for Real-Time Affordable Housing Site Selection: Multi-Objective Reinforcement Learning Under Regulatory Constraints
by: Imanov, Olaf Yunus Laitinen, et al.
Published: (2026)
Bayesian Conservative Policy Optimization (BCPO): A Novel Uncertainty-Calibrated Offline Reinforcement Learning with Credible Lower Bounds
by: Chatterjee, Debashis
Published: (2026)
Convex Regularization and Convergence of Policy Gradient Flows under Safety Constraints
by: Malo, Pekka, et al.
Published: (2024)
Deep neural networks can provably solve Bellman equations for Markov decision processes without the curse of dimensionality
by: Jentzen, Arnulf, et al.
Published: (2025)
Beyond Bellman: High-Order Generator Regression for Continuous-Time Policy Evaluation
by: Zheng, Yaowei, et al.
Published: (2026)
Reinforcement Learning Methods for the Stochastic Optimal Control of an Industrial Power-to-Heat System
by: Pilling, Eric, et al.
Published: (2024)
PAC-MCTS: Bias-Aware Pruning for Robust LLM-Guided Search and Planning
by: Qian, Tianhao
Published: (2026)
Machine Learning Algorithms for Improving Black Box Optimization Solvers
by: Kimiaei, Morteza, et al.
Published: (2025)
On Policy Evaluation Algorithms in Distributional Reinforcement Learning
by: Gerstenberg, Julian, et al.
Published: (2024)
Stability and Sensitivity Analysis of Relative Temporal-Difference Learning: Extended Version
by: Sakha, Masoud S., et al.
Published: (2026)
Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs
by: Gupta, Abhishek, et al.
Published: (2026)
Policy stability and ultimate stationarity in discounted risk-sensitive stochastic control
by: Bäuerle, Nicole, et al.
Published: (2026)
Blackwell optimality and policy stability for long-run risk sensitive stochastic control
by: Bäuerle, Nicole, et al.
Published: (2024)
Optimistic Training and Convergence of Q-Learning -- Extended Version
by: Mehta, Prashant, et al.
Published: (2026)
Mean–Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
by: Huang, Yilie, et al.
Published: (2024)
Neural Actor-Critic Methods for Hamilton-Jacobi-Bellman PDEs: Asymptotic Analysis and Numerical Studies
by: Cohen, Samuel N., et al.
Published: (2025)
Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty
by: Jia, Yanwei
Published: (2024)
Modeling Vehicle-Type-Specific Pedestrian Crash Avoidance Behavior in Safety-Critical Interactions Using Smooth-Mamba Deep Reinforcement Learning
by: Pu, Qingwen, et al.
Published: (2026)
Convergence proofs and strong error bounds for forward-backward stochastic differential equations using neural network simulations
by: Sheridan-Methven, Oliver
Published: (2024)
Asynchronous Stochastic Approximation with Applications to Average-Reward Reinforcement Learning
by: Yu, Huizhen, et al.
Published: (2024)
Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents
by: Hill, Brennen
Published: (2025)
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces
by: Kerimkulov, Bekzhan, et al.
Published: (2023)
An Optimal-Control Approach to Infinite-Horizon Restless Bandits: Achieving Asymptotic Optimality with Minimal Assumptions
by: YAN, Chen
Published: (2024)
Measurement-Driven Early Warning of Reliability Breakdown in 5G NSA Railway Networks
by: Chou, Po-Heng, et al.
Published: (2025)
SALE-Based Offline Reinforcement Learning with Ensemble Q-Networks
by: Chun, Zheng
Published: (2025)
Existence of bounded solutions to multiplicative Poisson equations under mixing property
by: Pitera, Marcin, et al.
Published: (2023)
A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays
by: Yu, Huizhen, et al.
Published: (2023)
Non-Expansive Mappings in Two-Time-Scale Stochastic Approximation: Finite-Time Analysis
by: Chandak, Siddharth
Published: (2025)
Securing Radiation Detection Systems with an Efficient TinyML-Based IDS for Edge Devices
by: Pizarro, Einstein Rivas, et al.
Published: (2025)
An Efficient Intrusion Detection System for Safeguarding Radiation Detection Systems
by: Coolidge, Nathanael, et al.
Published: (2025)
Probabilistic Approach to Black-Box Binary Optimization with Budget Constraints: Application to Sensor Placement
by: Attia, Ahmed
Published: (2024)
Two-Steps Diffusion Policy for Robotic Manipulation via Genetic Denoising
by: Clemente, Mateo, et al.
Published: (2025)
From Sparse to Dense: Toddler-inspired Reward Transition in Goal-Oriented Reinforcement Learning
by: Park, Junseok, et al.
Published: (2025)