Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Ou, Tingting, Medina, Marco Avella, Cummings, Rachel
Format:	Preprint
Published:	2024
Subjects:	Machine Learning Data Structures and Algorithms
Online Access:	https://arxiv.org/abs/2407.14879
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866911963634728960
author	Ou, Tingting Medina, Marco Avella Cummings, Rachel
author_facet	Ou, Tingting Medina, Marco Avella Cummings, Rachel
contents	In this work we first show that the classical Thompson sampling algorithm for multi-arm bandits is differentially private as-is, without any modification. We provide per-round privacy guarantees as a function of problem parameters and show composition over $T$ rounds; since the algorithm is unchanged, existing $O(\sqrt{NT\log N})$ regret bounds still hold and there is no loss in performance due to privacy. We then show that simple modifications -- such as pre-pulling all arms a fixed number of times, increasing the sampling variance -- can provide tighter privacy guarantees. We again provide privacy guarantees that now depend on the new parameters introduced in the modification, which allows the analyst to tune the privacy guarantee as desired. We also provide a novel regret analysis for this new algorithm, and show how the new parameters also impact expected regret. Finally, we empirically validate and illustrate our theoretical findings in two parameter regimes and demonstrate that tuning the new parameters substantially improve the privacy-regret tradeoff.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_14879
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Thompson Sampling Itself is Differentially Private Ou, Tingting Medina, Marco Avella Cummings, Rachel Machine Learning Data Structures and Algorithms In this work we first show that the classical Thompson sampling algorithm for multi-arm bandits is differentially private as-is, without any modification. We provide per-round privacy guarantees as a function of problem parameters and show composition over $T$ rounds; since the algorithm is unchanged, existing $O(\sqrt{NT\log N})$ regret bounds still hold and there is no loss in performance due to privacy. We then show that simple modifications -- such as pre-pulling all arms a fixed number of times, increasing the sampling variance -- can provide tighter privacy guarantees. We again provide privacy guarantees that now depend on the new parameters introduced in the modification, which allows the analyst to tune the privacy guarantee as desired. We also provide a novel regret analysis for this new algorithm, and show how the new parameters also impact expected regret. Finally, we empirically validate and illustrate our theoretical findings in two parameter regimes and demonstrate that tuning the new parameters substantially improve the privacy-regret tradeoff.
title	Thompson Sampling Itself is Differentially Private
topic	Machine Learning Data Structures and Algorithms
url	https://arxiv.org/abs/2407.14879

Similar Items