Bibliographic Details
Main Authors: Chua, Lynn; Cui, Qiliang; Ghazi, Badih; Harrison, Charlie; Kamath, Pritish; Krichene, Walid; Kumar, Ravi; Manurangsi, Pasin; Narra, Krishna Giri; Sinha, Amer; Varadarajan, Avinash; Zhang, Chiyuan
Format: Preprint
Published: 2024
Subjects: Machine Learning; Cryptography and Security; Information Retrieval
Online Access: https://arxiv.org/abs/2401.15246
author Chua, Lynn
Cui, Qiliang
Ghazi, Badih
Harrison, Charlie
Kamath, Pritish
Krichene, Walid
Kumar, Ravi
Manurangsi, Pasin
Narra, Krishna Giri
Sinha, Amer
Varadarajan, Avinash
Zhang, Chiyuan
contents Motivated by problems arising in digital advertising, we introduce the task of training differentially private (DP) machine learning models with semi-sensitive features. In this setting, a subset of the features is known to the attacker (and thus need not be protected), while the remaining features as well as the label are unknown to the attacker and should be protected by the DP guarantee. This task interpolates between training the model with full DP (where the label and all features should be protected) and with label DP (where all the features are considered known, and only the label should be protected). We present a new algorithm for training DP models with semi-sensitive features. Through an empirical evaluation on real ads datasets, we demonstrate that our algorithm surpasses in utility the baselines of (i) DP stochastic gradient descent (DP-SGD) run on all features (known and unknown), and (ii) a label DP algorithm run only on the known features (while discarding the unknown ones).
format Preprint
id arxiv_https___arxiv_org_abs_2401_15246
institution arXiv
publishDate 2024
record_format arxiv
title Training Differentially Private Ad Prediction Models with Semi-Sensitive Features
topic Machine Learning
Cryptography and Security
Information Retrieval
url https://arxiv.org/abs/2401.15246