Bibliographic Details
Main Authors: Bokelmann, Björn; Lessmann, Stefan
Format: Preprint
Published: 2024
Subjects: Methodology
Online Access: https://arxiv.org/abs/2401.14294
_version_ 1866913209393348608
author Bokelmann, Björn
Lessmann, Stefan
author_facet Bokelmann, Björn
Lessmann, Stefan
contents In many business applications, including online marketing and customer churn prevention, randomized controlled trials (RCTs) are conducted to investigate the effect of a specific treatment (coupon offers, advertisement mailings, ...). Such RCTs allow for the estimation of average treatment effects (ATEs) as well as the training of (uplift) models for the heterogeneity of treatment effects between individuals. The problem with these RCTs is that they are costly, and this cost increases with the number of individuals included in the RCT. For this reason, there is research on how to conduct experiments involving a small number of individuals while still obtaining precise treatment effect estimates. We contribute to this literature a heteroskedasticity-aware stratified sampling (HS) scheme, which leverages the fact that different individuals have different noise levels in their outcome and that precise treatment effect estimation requires more observations from the "high-noise" individuals than from the "low-noise" individuals. Through theory as well as empirical experiments, we demonstrate that HS-sampling yields significantly more precise estimates of the ATE, improves uplift models, and makes their evaluation more reliable compared to RCT data sampled completely at random. Due to its relative ease of application and its significant benefits, we expect HS-sampling to be valuable in many real-world applications.
format Preprint
id arxiv_https___arxiv_org_abs_2401_14294
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Heteroscedasticity-aware stratified sampling to improve uplift modeling
Bokelmann, Björn
Lessmann, Stefan
Methodology
title Heteroscedasticity-aware stratified sampling to improve uplift modeling
topic Methodology
url https://arxiv.org/abs/2401.14294
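note The abstract's central idea, that precise treatment effect estimation needs more observations from "high-noise" individuals, can be illustrated with a minimal sketch. The sketch below is not the authors' HS scheme. It uses classical Neyman allocation as a stand-in and assumes two hypothetical strata with known population shares and outcome noise levels, a 50/50 treatment split within each stratum, and equal noise in both arms. All names and numbers are illustrative assumptions.

```python
# Illustrative sketch only: classical Neyman allocation, NOT the paper's HS scheme.
# Strata, shares, and noise levels below are hypothetical.


def proportional_allocation(shares, n_total):
    """Sample each stratum in proportion to its population share."""
    return [n_total * w for w in shares]


def neyman_allocation(shares, sigmas, n_total):
    """Sample each stratum in proportion to share * outcome std. deviation."""
    weights = [w * s for w, s in zip(shares, sigmas)]
    total = sum(weights)
    return [n_total * w / total for w in weights]


def ate_variance(shares, sigmas, allocation):
    """Variance of the stratified difference-in-means ATE estimator.

    Assumes half of each stratum's sample is treated and half is control,
    with the same noise level sigma in both arms, so each stratum
    contributes share^2 * (sigma^2 / (n/2) + sigma^2 / (n/2)).
    """
    return sum(
        w ** 2 * 4 * s ** 2 / n for w, s, n in zip(shares, sigmas, allocation)
    )


shares = [0.8, 0.2]   # hypothetical: 80% low-noise, 20% high-noise individuals
sigmas = [1.0, 5.0]   # hypothetical outcome standard deviations per stratum
n_total = 1000        # total experiment size (fractional counts kept for clarity)

prop = proportional_allocation(shares, n_total)
neyman = neyman_allocation(shares, sigmas, n_total)

var_prop = ate_variance(shares, sigmas, prop)
var_neyman = ate_variance(shares, sigmas, neyman)

print("proportional allocation:", prop, "ATE variance:", var_prop)
print("noise-aware allocation: ", neyman, "ATE variance:", var_neyman)
```

Under these assumptions the noise-aware allocation draws more than half of the sample from the high-noise stratum, and the resulting ATE variance is lower than under proportional sampling with the same total sample size. In practice the noise levels would have to be estimated, and the allocations rounded to whole individuals.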