Saved in:
| Main Authors: | , , , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.16329 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Table of Contents:
- Differentially private (DP) linear regression has received significant attention in the recent theoretical literature, with several approaches proposed to improve error rates. Our work considers the popular high-dimensional regime with random data, where the number of training samples $n$ and the input dimension $d$ grow at a proportional rate $d / n \to γ$, and it studies a family of one-pass DP gradient descent (DP-GD) algorithms satisfying $ρ^2 / 2$ zero concentrated DP. In this setting, we establish a deterministic equivalent for the DP-GD trajectory in terms of a system of ordinary differential equations. This allows to analyze the effect of gradient clipping constants that are smaller than the typical norm of the per-sample gradients - a setup shown to improve performance in practice. For well-conditioned data, we show that DP-GD, upon properly choosing clipping constant and learning rate, achieves the non-asymptotic risk of $O(γ+ γ^2 / ρ^2)$, and we establish that this rate is minimax optimal. Then, we consider the ill-conditioned case where the data covariance spectrum follows a power-law distribution, and we show that the risk displays a power-like scaling law in $γ$, highlighting the change in the exponent as a function of the privacy parameter $ρ$. Overall, our analysis demonstrates the benefits of practical algorithmic design choices, including aggressive gradient clipping and decaying learning rate schedules.