Guardado en:
Detalles Bibliográficos
Autores principales: Kong, Weiwei, Ribero, Mónica
Formato: Preprint
Publicado: 2024
Materias:
Acceso en línea:https://arxiv.org/abs/2407.05237
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
_version_ 1866915143931133952
author Kong, Weiwei
Ribero, Mónica
author_facet Kong, Weiwei
Ribero, Mónica
contents Differentially-private stochastic gradient descent (DP-SGD) is a family of iterative machine learning training algorithms that privatize gradients to generate a sequence of differentially-private (DP) model parameters. It is also the standard tool used to train DP models in practice, even though most users are only interested in protecting the privacy of the final model. Tight DP accounting for the last iterate would minimize the amount of noise required while maintaining the same privacy guarantee and potentially increasing model utility. However, last-iterate accounting is challenging, and existing works require strong assumptions not satisfied by most implementations. These include assuming (i) the global sensitivity constant is known - to avoid gradient clipping; (ii) the loss function is Lipschitz or convex; and (iii) input batches are sampled randomly. In this work, we forego any unrealistic assumptions and provide privacy bounds for the most commonly used variant of DP-SGD, in which data is traversed cyclically, gradients are clipped, and only the last model is released. More specifically, we establish new Renyi differential privacy (RDP) upper bounds for the last iterate under realistic assumptions of small stepsize and Lipschitz smoothness of the loss function. Our general bounds also recover the special-case convex bounds when the weak-convexity parameter of the objective function approaches zero and no clipping is performed. The approach itself leverages optimal transport techniques for last iterate bounds, which is a nontrivial task when the data is traversed cyclically and the loss function is nonconvex.
format Preprint
id arxiv_https___arxiv_org_abs_2407_05237
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Privacy of the last iterate in cyclically-sampled DP-SGD on nonconvex composite losses
Kong, Weiwei
Ribero, Mónica
Machine Learning
Cryptography and Security
Data Structures and Algorithms
Optimization and Control
65K10 (Primary), 60G15, 68P27
G.3; G.1.6
Differentially-private stochastic gradient descent (DP-SGD) is a family of iterative machine learning training algorithms that privatize gradients to generate a sequence of differentially-private (DP) model parameters. It is also the standard tool used to train DP models in practice, even though most users are only interested in protecting the privacy of the final model. Tight DP accounting for the last iterate would minimize the amount of noise required while maintaining the same privacy guarantee and potentially increasing model utility. However, last-iterate accounting is challenging, and existing works require strong assumptions not satisfied by most implementations. These include assuming (i) the global sensitivity constant is known - to avoid gradient clipping; (ii) the loss function is Lipschitz or convex; and (iii) input batches are sampled randomly. In this work, we forego any unrealistic assumptions and provide privacy bounds for the most commonly used variant of DP-SGD, in which data is traversed cyclically, gradients are clipped, and only the last model is released. More specifically, we establish new Renyi differential privacy (RDP) upper bounds for the last iterate under realistic assumptions of small stepsize and Lipschitz smoothness of the loss function. Our general bounds also recover the special-case convex bounds when the weak-convexity parameter of the objective function approaches zero and no clipping is performed. The approach itself leverages optimal transport techniques for last iterate bounds, which is a nontrivial task when the data is traversed cyclically and the loss function is nonconvex.
title Privacy of the last iterate in cyclically-sampled DP-SGD on nonconvex composite losses
topic Machine Learning
Cryptography and Security
Data Structures and Algorithms
Optimization and Control
65K10 (Primary), 60G15, 68P27
G.3; G.1.6
url https://arxiv.org/abs/2407.05237