| Main Authors: | Qian, Jian, Hu, Haichen, Simchi-Levi, David |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.17796 |
Similar Items
Model-Based Reinforcement Learning with Double Oracle Efficiency in Policy Optimization and Offline Estimation
by: Hu, Haichen, et al.
Published: (2026)
Interleaved Resampling and Refitting: Data and Compute-Efficient Evaluation of Black-Box Predictors
by: Hu, Haichen, et al.
Published: (2026)
Perturbing the Derivative: Doubly Wild Refitting for Model-Free Evaluation of Opaque Machine Learning Predictors
by: Hu, Haichen, et al.
Published: (2025)
Perturbing the Derivative: Wild Refitting for Model-Free Evaluation of Machine Learning Models under Bregman Losses
by: Hu, Haichen, et al.
Published: (2025)
Pre-Trained AI Model Assisted Online Decision-Making under Missing Covariates: A Theoretical Perspective
by: Hu, Haichen, et al.
Published: (2025)
Contextual Online Decision Making with Infinite-Dimensional Functional Regression
by: Hu, Haichen, et al.
Published: (2025)
Constrained Online Decision-Making: A Unified Framework
by: Hu, Haichen, et al.
Published: (2025)
On the Optimal Regret of Locally Private Linear Contextual Bandit
by: Li, Jiachun, et al.
Published: (2024)
Exploration-Exploitation Tradeoff in Universal Lossy Compression
by: Weinberger, Nir, et al.
Published: (2025)
Neural Exploitation and Exploration of Contextual Bandits
by: Ban, Yikun, et al.
Published: (2023)
From Confounding to Learning: Dynamic Service Fee Pricing on Third-Party Platforms
by: Ai, Rui, et al.
Published: (2025)
Learning to Price with Resource Constraints: From Full Information to Machine-Learned Prices
by: Ao, Ruicheng, et al.
Published: (2025)
Adaptive Estimation and Optimal Control in Offline Contextual MDPs without Stationarity
by: Bhattacharyya, Riddhiman, et al.
Published: (2026)
Sobolev Norm Learning Rates for Conditional Mean Embeddings
by: Talwai, Prem, et al.
Published: (2021)
Improving the Estimation of Lifetime Effects in A/B Testing via Treatment Locality
by: Chen, Shuze, et al.
Published: (2024)
Optimal Adaptive Experimental Design for Estimating Treatment Effect
by: Li, Jiachun, et al.
Published: (2024)
Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions
by: Mhammedi, Zakaria
Published: (2024)
Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models
by: Su, Xun, et al.
Published: (2025)
Exploration Implies Data Augmentation: Reachability and Generalisation in Contextual MDPs
by: Weltevrede, Max, et al.
Published: (2024)
Near-Optimal Regret for Policy Optimization in Contextual MDPs with General Offline Function Approximation
by: Levy, Orin, et al.
Published: (2026)
Taming the Monster Every Context: Complexity Measure and Unified Framework for Offline-Oracle Efficient Contextual Bandits
by: Qin, Hao, et al.
Published: (2026)
Prediction-Guided Active Experiments
by: Ao, Ruicheng, et al.
Published: (2024)
Beyond ATE: Multi-Criteria Design for A/B Testing
by: Li, Jiachun, et al.
Published: (2025)
Regret-Oracle Complexity Tradeoffs in Agnostic Online Learning
by: Attias, Idan, et al.
Published: (2026)
A Simple and Optimal Policy Design with Safety against Heavy-Tailed Risk for Stochastic Bandits
by: Simchi-Levi, David, et al.
Published: (2022)
Partial Identification under Missing Data Using Weak Shadow Variables from Pretrained Models
by: Chen, Hongyu, et al.
Published: (2026)
The Value of Information in Resource-Constrained Pricing
by: Ao, Ruicheng, et al.
Published: (2026)
Privacy Preserving Adaptive Experiment Design
by: Li, Jiachun, et al.
Published: (2024)
Regret Distribution in Stochastic Bandits: Optimal Trade-off between Expectation and Tail Risk
by: Simchi-Levi, David, et al.
Published: (2023)
On the Reliability Limits of LLM-Based Multi-Agent Planning
by: Ao, Ruicheng, et al.
Published: (2026)
ORLoopBench: Solver-in-the-Loop Benchmarks for Self-Correction and Behavioral Rationality in Operations Research
by: Ao, Ruicheng, et al.
Published: (2026)
OptiRepair: Closed-Loop Diagnosis and Repair of Supply Chain Optimization Models with LLM Agents
by: Ao, Ruicheng, et al.
Published: (2026)
Is Pure Exploitation Sufficient in Exogenous MDPs with Linear Function Approximation?
by: Liang, Hao, et al.
Published: (2026)
Exploitation Over Exploration: Unmasking the Bias in Linear Bandit Recommender Offline Evaluation
by: Pires, Pedro R., et al.
Published: (2025)
Offline-Online Reinforcement Learning for Linear Mixture MDPs
by: Zhang, Zhongjun, et al.
Published: (2026)
Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches
by: Simchi-Levi, David, et al.
Published: (2019)
Efficient Model-Free Exploration in Low-Rank MDPs
by: Mhammedi, Zakaria, et al.
Published: (2023)
Sample Complexity Characterization for Linear Contextual MDPs
by: Deng, Junze, et al.
Published: (2024)
Eluder-based Regret for Stochastic Contextual MDPs
by: Levy, Orin, et al.
Published: (2022)
Provable Offline Reinforcement Learning for Structured Cyclic MDPs
by: Lee, Kyungbok, et al.
Published: (2026)