| Main Authors: | Lee, Kyungbok, Paik, Myunghee Cho |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.01830 |
Similar Items
Log-Sum-Exponential Estimator for Off-Policy Evaluation and Learning
by: Behnamnia, Armin, et al.
Published: (2025)
Off-Policy Evaluation for Ranking Policies under Deterministic Logging Policies
by: Tanaka, Koichi, et al.
Published: (2026)
Off-Policy Evaluation from Logged Human Feedback
by: Bhargava, Aniruddha, et al.
Published: (2024)
Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings
by: Zenati, Houssam, et al.
Published: (2025)
Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning
by: Shen, Ye, et al.
Published: (2021)
Logging Policy Design for Off-Policy Evaluation
by: Douglas, Connor, et al.
Published: (2026)
Doubly Optimal Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)
Doubly Robust Fusion of Many Treatments for Policy Learning
by: Zhu, Ke, et al.
Published: (2025)
CANDOR: Counterfactual ANnotated DOubly Robust Off-Policy Evaluation
by: Mandyam, Aishwarya, et al.
Published: (2024)
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
by: Lee, Haanvid, et al.
Published: (2024)
Meta Off-Policy Estimation
by: Jeunen, Olivier
Published: (2025)
Robustness of Refugee-Matching Gains to Off-Policy Evaluation Choices
by: Bansak, Kirk, et al.
Published: (2026)
Efficient Training of Boltzmann Generators Using Off-Policy Log-Dispersion Regularization
by: Schopmans, Henrik, et al.
Published: (2026)
From Weighting to Modeling: A Nonparametric Estimator for Off-Policy Evaluation
by: Zhu, Rong J. B.
Published: (2026)
Cross-Validated Off-Policy Evaluation
by: Cief, Matej, et al.
Published: (2024)
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
by: Zhou, Hongyi, et al.
Published: (2025)
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
by: Kiyohara, Haruka, et al.
Published: (2024)
Context-Action Embedding Learning for Off-Policy Evaluation in Contextual Bandits
by: Chandak, Kushagra, et al.
Published: (2025)
Off-policy Evaluation in Doubly Inhomogeneous Environments
by: Bian, Zeyu, et al.
Published: (2023)
Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes
by: Bennett, Andrew, et al.
Published: (2024)
$Δ\text{-}{\rm OPE}$: Off-Policy Estimation with Pairs of Policies
by: Jeunen, Olivier, et al.
Published: (2024)
Concept-driven Off Policy Evaluation
by: Majumdar, Ritam, et al.
Published: (2024)
Clustering Context in Off-Policy Evaluation
by: Guzman-Olivares, Daniel, et al.
Published: (2025)
Long-term Off-Policy Evaluation and Learning
by: Saito, Yuta, et al.
Published: (2024)
Data Poisoning Attacks on Off-Policy Policy Evaluation Methods
by: Lobo, Elita, et al.
Published: (2024)
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation
by: Nakanishi, Kosuke, et al.
Published: (2025)
Off-Policy Evaluation of Ranking Policies via Embedding-Space User Behavior Modeling
by: Takahashi, Tatsuki, et al.
Published: (2025)
Off-Policy Evaluation for Recommendations with Missing-Not-At-Random Rewards
by: Takahashi, Tatsuki, et al.
Published: (2025)
Off-Policy Evaluation Under Nonignorable Missing Data
by: Wang, Han, et al.
Published: (2025)
Off-Policy Evaluation and Learning for Matching Markets
by: Hayashi, Yudai, et al.
Published: (2025)
Learning Action Embeddings for Off-Policy Evaluation
by: Cief, Matej, et al.
Published: (2023)
Exploiting Similarities in A/B Testing with Off-Policy Estimation
by: Sakhi, Otmane, et al.
Published: (2025)
When Do Off-Policy and On-Policy Policy Gradient Methods Align?
by: Mambelli, Davide, et al.
Published: (2024)
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
by: Liu, Weidong, et al.
Published: (2023)
Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training
by: Mroueh, Youssef, et al.
Published: (2025)
Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning
by: Sakhi, Otmane, et al.
Published: (2024)
Breaking the Curse of Repulsion: Optimistic Distributionally Robust Policy Optimization for Off-Policy Generative Recommendation
by: Jiang, Jie, et al.
Published: (2026)
Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning
by: Goodall, Alexander W., et al.
Published: (2025)
IntOPE: Off-Policy Evaluation in the Presence of Interference
by: Bai, Yuqi, et al.
Published: (2024)
Automated Off-Policy Estimator Selection via Supervised Learning
by: Felicioni, Nicolò, et al.
Published: (2024)