| Main Authors: | Zhu, Youheng, Lu, Yiping |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.03191 |
Similar Items
Recurrent Natural Policy Gradient for POMDPs
by: Cayci, Semih, et al.
Published: (2024)
Residuals-based Offline Reinforcement Learning
by: Zhu, Qing, et al.
Published: (2026)
Solving Truly Massive Budgeted Monotonic POMDPs with Oracle-Guided Meta-Reinforcement Learning
by: Vora, Manav, et al.
Published: (2024)
Offline Reinforcement Learning via Inverse Optimization
by: Dimanidis, Ioannis, et al.
Published: (2025)
Dual Control of Linear Systems from Bilinear Observations with Belief Space Model Predictive Control
by: Cao, Daniel, et al.
Published: (2026)
Offline Hierarchical Reinforcement Learning via Inverse Optimization
by: Schmidt, Carolin, et al.
Published: (2024)
Operator Models for Continuous-Time Offline Reinforcement Learning
by: Hoischen, Nicolas, et al.
Published: (2025)
On the Width Scaling of Neural Optimizers Under Matrix Operator Norms I: Row/Column Normalization and Hyperparameter Transfer
by: Xu, Ruihan, et al.
Published: (2026)
Offline-Online Reinforcement Learning for Linear Mixture MDPs
by: Zhang, Zhongjun, et al.
Published: (2026)
Reward-Relevance-Filtered Linear Offline Reinforcement Learning
by: Zhou, Angela
Published: (2024)
Online Residual Learning from Offline Experts for Pedestrian Tracking
by: Vlachos, Anastasios, et al.
Published: (2024)
Offline Policy Learning with Weight Clipping and Heaviside Composite Optimization
by: Liu, Jingren, et al.
Published: (2026)
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning
by: Di, Qiwei, et al.
Published: (2023)
PAC-Bayes Meets Online Contextual Optimization
by: Xie, Zhuojun, et al.
Published: (2025)
Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints
by: Ozdaglar, Asuman, et al.
Published: (2022)
No-Rank Tensor Decomposition Using Metric Learning
by: Bagherian, Maryam
Published: (2025)
SPP-SBL: Space-Power Prior Sparse Bayesian Learning for Block Sparse Recovery
by: Zhang, Yanhao, et al.
Published: (2025)
Learning to Cover: Online Learning and Optimization with Irreversible Decisions
by: Jacquillat, Alexandre, et al.
Published: (2024)
SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization
by: Zhu, Shuchen, et al.
Published: (2024)
OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud
by: Zou, Ding, et al.
Published: (2023)
Towards Optimal Offline Reinforcement Learning
by: Li, Mengmeng, et al.
Published: (2025)
Wait-Less Offline Tuning and Re-solving for Online Decision Making
by: Sun, Jingruo, et al.
Published: (2024)
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
by: Zhang, Dake, et al.
Published: (2024)
Modeling Hierarchical Spaces: A Review and Unified Framework for Surrogate-Based Architecture Design
by: Saves, Paul, et al.
Published: (2025)
Unsupervised Ground Metric Learning
by: Auffenberg, Janis, et al.
Published: (2025)
A Distance Metric for Mixed Integer Programming Instances
by: Maudet, Gwen, et al.
Published: (2025)
Formation Shape Control using the Gromov-Wasserstein Metric
by: Nakashima, Haruto, et al.
Published: (2025)
Distributionally Robust Optimization via Iterative Algorithms in Continuous Probability Spaces
by: Zhu, Linglingzhi, et al.
Published: (2024)
A Framework for Adaptive Stabilisation of Nonlinear Stochastic Systems
by: Siriya, Seth, et al.
Published: (2025)
Provable Offline Reinforcement Learning for Structured Cyclic MDPs
by: Lee, Kyungbok, et al.
Published: (2026)
Integrated Offline and Online Learning to Solve a Large Class of Scheduling Problems
by: Liu, Anbang, et al.
Published: (2025)
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL
by: Zhang, Weitong, et al.
Published: (2021)
TaskMet: Task-Driven Metric Learning for Model Learning
by: Bansal, Dishank, et al.
Published: (2023)
On the Power of (Approximate) Reward Models for Inference-Time Scaling
by: Zhu, Youheng, et al.
Published: (2026)
A Control Theoretic Framework for Adaptive Gradient Optimizers in Machine Learning
by: Chakrabarti, Kushal, et al.
Published: (2022)
Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
by: Calo, Sergio, et al.
Published: (2024)
Belief Samples Are All You Need For Social Learning
by: JafariNodeh, Mahyar, et al.
Published: (2024)
Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces
by: Bicer, Osman, et al.
Published: (2025)
A Randomized Zeroth-Order Hierarchical Framework for Heterogeneous Federated Learning
by: Qiu, Yuyang, et al.
Published: (2025)
FERERO: A Flexible Framework for Preference-Guided Multi-Objective Learning
by: Chen, Lisha, et al.
Published: (2024)