Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Shi, Liang, Bakhitov, Edvard, Hung, Kenneth, Karrer, Brian, Walker, Charlie, Bhole, Monica, Schrijvers, Okke
Format:	Preprint
Published:	2024
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/2402.11070
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866908627192774656
author	Shi, Liang Bakhitov, Edvard Hung, Kenneth Karrer, Brian Walker, Charlie Bhole, Monica Schrijvers, Okke
author_facet	Shi, Liang Bakhitov, Edvard Hung, Kenneth Karrer, Brian Walker, Charlie Bhole, Monica Schrijvers, Okke
contents	Bipartite Experiments are randomized experiments where the treatment is applied to a set of units (randomization units) that is different from the units of analysis, and randomization units and analysis units are connected through a bipartite graph. The scale of experimentation at large online platforms necessitates both accurate inference in the presence of a large bipartite interference graph, as well as a highly scalable implementation. In this paper, we describe new methods for inference that enable practical, scalable analysis of bipartite experiments: (1) We propose CA-ERL, a covariate-adjusted variant of the exposure-reweighted-linear (ERL) estimator [9], which empirically yields 60-90% variance reduction. (2) We introduce a randomization-based method for inference and prove asymptotic validity of a Wald-type confidence interval under graph sparsity assumptions. (3) We present a linear-time algorithm for randomization inference of the CA-ERL estimator, which can be easily implemented in query engines like Presto or Spark. We evaluate our methods both on a real experiment at Meta that randomized treatment on Facebook Groups and analyzed user-level metrics, as well as simulations on synthetic data. The real-world data shows that our CA-ERL estimator reduces the confidence interval (CI) width by 60-90% (compared to ERL) in a practical setting. The simulations using synthetic data show that our randomization inference procedure achieves correct coverage across instances, while the ERL estimator has incorrectly small CI widths for instances with large true effect sizes and is overly conservative when the bipartite graph is dense.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_11070
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Scalable Analysis of Bipartite Experiments Shi, Liang Bakhitov, Edvard Hung, Kenneth Karrer, Brian Walker, Charlie Bhole, Monica Schrijvers, Okke Methodology Bipartite Experiments are randomized experiments where the treatment is applied to a set of units (randomization units) that is different from the units of analysis, and randomization units and analysis units are connected through a bipartite graph. The scale of experimentation at large online platforms necessitates both accurate inference in the presence of a large bipartite interference graph, as well as a highly scalable implementation. In this paper, we describe new methods for inference that enable practical, scalable analysis of bipartite experiments: (1) We propose CA-ERL, a covariate-adjusted variant of the exposure-reweighted-linear (ERL) estimator [9], which empirically yields 60-90% variance reduction. (2) We introduce a randomization-based method for inference and prove asymptotic validity of a Wald-type confidence interval under graph sparsity assumptions. (3) We present a linear-time algorithm for randomization inference of the CA-ERL estimator, which can be easily implemented in query engines like Presto or Spark. We evaluate our methods both on a real experiment at Meta that randomized treatment on Facebook Groups and analyzed user-level metrics, as well as simulations on synthetic data. The real-world data shows that our CA-ERL estimator reduces the confidence interval (CI) width by 60-90% (compared to ERL) in a practical setting. The simulations using synthetic data show that our randomization inference procedure achieves correct coverage across instances, while the ERL estimator has incorrectly small CI widths for instances with large true effect sizes and is overly conservative when the bipartite graph is dense.
title	Scalable Analysis of Bipartite Experiments
topic	Methodology
url	https://arxiv.org/abs/2402.11070

Similar Items