Bibliographic Details
Main Authors: Pinzón, Carlos Antonio, ElSalamouny, Ehab, Massot, Lucas, Miller, Alexis, Arcolezi, Héber Hwang, Palamidessi, Catuscia
Format: Preprint
Published: 2026
Subjects: Cryptography and Security
Online Access: https://arxiv.org/abs/2601.08603
author Pinzón, Carlos Antonio
ElSalamouny, Ehab
Massot, Lucas
Miller, Alexis
Arcolezi, Héber Hwang
Palamidessi, Catuscia
contents Randomized Response (RR) is a protocol designed to collect and analyze categorical data with local differential privacy guarantees. It has been used as a building block of mechanisms deployed by big tech companies to collect app or web users' data. Each user reports an automatic random alteration of their true value to the analytics server, which then estimates the histogram of the true unseen values of all users using a debiasing rule to compensate for the added randomness. A known issue is that the standard debiasing rule can yield a vector with negative values (which cannot be interpreted as a histogram), and there is no consensus on the best fix. An elegant but slow solution is the Iterative Bayesian Update algorithm (IBU), which converges to the Maximum Likelihood Estimate (MLE) as the number of iterations goes to infinity. This paper bypasses IBU by providing a simple formula for the exact MLE of RR and compares it with other estimation methods experimentally to help practitioners decide which one to use.
format Preprint
id arxiv_https___arxiv_org_abs_2601_08603
institution arXiv
publishDate 2026
record_format arxiv
title Estimating the True Distribution of Data Collected with Randomized Response
topic Cryptography and Security
url https://arxiv.org/abs/2601.08603
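
The abstract describes a report step and a debiasing step. As a minimal sketch, assuming the common k-ary variant of RR (the record does not say which variant the paper uses), the following Python illustrates both steps and how the standard debiasing rule can produce negative entries. The names rr_report and debias, the parameters k and epsilon, and the sample data are illustrative assumptions, not taken from the paper, and the paper's exact MLE formula is not reproduced here.

```python
import math
import random
from collections import Counter


def rr_report(true_value, k, epsilon, rng):
    """Assumed k-ary RR: keep the true category with probability p,
    otherwise report one of the other k - 1 categories uniformly."""
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return true_value
    other = rng.randrange(k - 1)  # index among the k - 1 other categories
    return other if other < true_value else other + 1


def debias(reports, k, epsilon):
    """Standard inversion estimate of the true frequencies.
    Entries sum to 1 but may be negative."""
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = 1.0 / (math.exp(epsilon) + k - 1)
    n = len(reports)
    counts = Counter(reports)
    return [(counts[i] / n - q) / (p - q) for i in range(k)]


# Hypothetical population: categories 2 and 3 never occur.
rng = random.Random(0)
k, epsilon = 4, 0.5
true_values = [0] * 900 + [1] * 100
reports = [rr_report(v, k, epsilon, rng) for v in true_values]
estimate = debias(reports, k, epsilon)

# Extreme case: if every report is category 0, the other entries
# become -q / (p - q), which is negative.
extreme = debias([0] * 10, k, epsilon)
```

In the extreme case the estimate still sums to 1 but is not a valid histogram, which is the issue the abstract refers to; how best to repair it is the subject of the paper rather than of this sketch.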