Bibliographic Details
Main Author: Gershoff, Matthew
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2501.14329
_version_ 1866916581734350848
author Gershoff, Matthew
author_facet Gershoff, Matthew
contents A core principle of Privacy by Design (PbD) is minimizing the data that is stored or shared about each individual respondent. PbD principles are mandated by the GDPR (see Article 5(1)(c) and Article 25) and inform aspects of the California Privacy Rights Act (CPRA). This paper describes a simple and effective approach that can be used in many A/B testing and similar contexts to help meet these PbD goals. Specifically, it presents a method for running OLS regression on k-anonymized data. To illustrate the general utility of this approach, two important use cases are described: 1) calculating partial F-tests as a simple way both to check for A/B test interactions and to test for heterogeneity of treatment effects; and 2) regression adjustment, using an approach similar to the popular CUPED method, as a variance reduction method for A/B tests. Because only aggregate-level rather than individual-level data are stored, shared, or analyzed, the method has advantages for privacy and compliance and often reduces data storage and processing costs.
format Preprint
id arxiv_https___arxiv_org_abs_2501_14329
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle K-Anonymous A/B Testing
Gershoff, Matthew
Applications
63
G.3
title K-Anonymous A/B Testing
topic Applications
63
G.3
url https://arxiv.org/abs/2501.14329
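
The abstract's central idea, running OLS on aggregate rather than individual-level data, can be illustrated with a minimal sketch. This is not taken from the paper: it assumes all regressors are categorical, so every row in a cell shares one design row, and that each cell stores only a count, a sum, and a sum of squares of the outcome. The names `aggregate`, `ols_from_aggregates`, and `design`, and the choice to raise an error on cells smaller than k, are hypothetical choices made for illustration.

```python
import numpy as np


def aggregate(cells, y, k):
    """Collapse rows to per-cell (count, sum of y, sum of y^2).

    Assumption: a cell with fewer than k rows violates k-anonymity,
    so this sketch refuses to release the table at all.
    """
    table = {}
    for cell, value in zip(cells, y):
        n, s, ss = table.get(cell, (0, 0.0, 0.0))
        table[cell] = (n + 1, s + value, ss + value * value)
    small = [cell for cell, (n, _, _) in table.items() if n < k]
    if small:
        raise ValueError(f"cells below k={k}: {small}")
    return table


def ols_from_aggregates(table, design):
    """Fit OLS using only the per-cell aggregates.

    `design` (hypothetical helper) maps a cell key to its regressor row.
    X'X, X'y and y'y are rebuilt from counts, sums and sums of squares.
    """
    p = len(design(next(iter(table))))
    xtx = np.zeros((p, p))
    xty = np.zeros(p)
    yty = 0.0
    n_total = 0
    for cell, (n, s, ss) in table.items():
        x = np.asarray(design(cell), dtype=float)
        xtx += n * np.outer(x, x)  # n identical rows contribute n * x x'
        xty += s * x               # sum over the cell of x * y_i
        yty += ss
        n_total += n
    beta = np.linalg.solve(xtx, xty)
    rss = yty - beta @ xty  # residual sum of squares at the OLS solution
    return beta, rss, n_total
```

When no cell is dropped, the coefficients and residual sum of squares match those of the same regression on the individual rows, and a residual sum of squares per model is the ingredient a partial F-test compares between a restricted and an unrestricted model.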