| Main Authors: | Liang, Hao, Cheng, Jiayu, Sinclair, Sean R., Du, Yali |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.20694 |
Similar Items
Offline-Online Reinforcement Learning for Linear Mixture MDPs
by: Zhang, Zhongjun, et al.
Published: (2026)
Exploiting Exogenous Structure for Sample-Efficient Reinforcement Learning
by: Wan, Jia, et al.
Published: (2024)
Sample Complexity Characterization for Linear Contextual MDPs
by: Deng, Junze, et al.
Published: (2024)
Reinforcement Learning in MDPs with Information-Ordered Policies
by: Zhang, Zhongjun, et al.
Published: (2025)
Reinforcement Learning for Infinite-Horizon Average-Reward Linear MDPs via Approximation by Discounted-Reward MDPs
by: Hong, Kihyuk, et al.
Published: (2024)
Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation
by: Kitamura, Toshinori, et al.
Published: (2025)
Two-Timescale Critic-Actor for Average Reward MDPs with Function Approximation
by: Panda, Prashansa, et al.
Published: (2024)
Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff
by: Qian, Jian, et al.
Published: (2024)
Near-Optimal Regret for Policy Optimization in Contextual MDPs with General Offline Function Approximation
by: Levy, Orin, et al.
Published: (2026)
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
by: He, Jianliang, et al.
Published: (2024)
Exogenous Isomorphism for Counterfactual Identifiability
by: Chen, Yikang, et al.
Published: (2025)
The Data-Driven Censored Newsvendor Problem
by: Hssaine, Chamsi, et al.
Published: (2024)
Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions
by: Mhammedi, Zakaria
Published: (2024)
Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
by: Lee, Joongkyu, et al.
Published: (2024)
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
by: Du, Ally Yalei, et al.
Published: (2024)
Nonstationary Reinforcement Learning with Linear Function Approximation
by: Zhou, Huozhi, et al.
Published: (2020)
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs
by: Maran, Davide, et al.
Published: (2024)
Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs
by: Li, Long-Fei, et al.
Published: (2024)
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
by: Cassel, Asaf, et al.
Published: (2024)
Imitation Learning in Discounted Linear MDPs without exploration assumptions
by: Viano, Luca, et al.
Published: (2024)
Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems
by: Liang, Hao, et al.
Published: (2025)
Invariant Learning via Probability of Sufficient and Necessary Causes
by: Yang, Mengyue, et al.
Published: (2023)
Refined Sample Complexity for Markov Games with Independent Linear Function Approximation
by: Dai, Yan, et al.
Published: (2024)
Addressing Finite-Horizon MDPs via Low-Rank Tensor Value Approximation
by: Rozada, Sergio, et al.
Published: (2025)
Sample Complexity Bounds for Linear Constrained MDPs with a Generative Model
by: Liu, Xingtu, et al.
Published: (2025)
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
by: Tan, Kevin, et al.
Published: (2024)
CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables
by: Zhou, Pengfei, et al.
Published: (2025)
Exogenous Matching: Learning Good Proposals for Tractable Counterfactual Estimation
by: Chen, Yikang, et al.
Published: (2024)
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
by: Cayci, Semih, et al.
Published: (2021)
Select, then Balance: Exploring Exogenous Variable Modeling of Spatio-Temporal Forecasting
by: Chen, Wei, et al.
Published: (2025)
Replicable Reinforcement Learning with Linear Function Approximation
by: Eaton, Eric, et al.
Published: (2025)
Pure Exploration in Bandits with Linear Constraints
by: Carlsson, Emil, et al.
Published: (2023)
End-to-End Efficient RL for Linear Bellman Complete MDPs with Deterministic Transitions
by: Mhammedi, Zakaria, et al.
Published: (2026)
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs
by: Hong, Kihyuk, et al.
Published: (2024)
Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition
by: Li, Long-Fei, et al.
Published: (2024)
Non-Stationary Inventory Control with Lead Times
by: Amiri, Nele H., et al.
Published: (2026)
Towards Optimal Differentially Private Regret Bounds in Linear MDPs
by: Sahu, Sharan
Published: (2025)
Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs
by: John, Philips George, et al.
Published: (2024)
Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation
by: Liu, Hao, et al.
Published: (2024)
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
by: Zhang, Junkai, et al.
Published: (2023)