Abstract
We present differentially private algorithms for high-dimensional mean estimation. Previous private estimators on distributions over ℝ^d suffer from a curse of dimensionality, as they require Ω(d^{1/2}) samples to achieve non-trivial error, even in cases where O(1) samples suffice without privacy. This rate is unavoidable when the distribution is isotropic, namely, when the covariance is a multiple of the identity matrix. Yet, real-world data is often highly anisotropic, with signals concentrated on a small number of principal components. We develop estimators that are appropriate for such signals: our estimators are (ε, δ)-differentially private and have sample complexity that is dimension-independent for anisotropic subgaussian distributions. Given n samples from a distribution with known covariance-proxy Σ and unknown mean µ, we present an estimator µ̂ that achieves error ∥µ̂ − µ∥₂ ≤ α as long as n ≳ tr(Σ)/α² + tr(Σ^{1/2})/(αε). We show that this is the optimal sample complexity for this task up to logarithmic factors.
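To illustrate the dimension-independence claim, the sketch below (not from the paper; the spectra are hypothetical examples) evaluates the sample-complexity bound n ≳ tr(Σ)/α² + tr(Σ^{1/2})/(αε) for an isotropic covariance versus an anisotropic one with a fast-decaying spectrum, at the same dimension d.

```python
import numpy as np

# Sample-complexity bound from the abstract: tr(Σ)/α² + tr(Σ^{1/2})/(αε).
# For a diagonal (or diagonalized) Σ, tr(Σ) is the sum of the eigenvalues
# and tr(Σ^{1/2}) is the sum of their square roots.
def sample_bound(eigs, alpha, eps):
    return eigs.sum() / alpha**2 + np.sqrt(eigs).sum() / (alpha * eps)

d = 10_000
alpha, eps = 0.1, 1.0

# Isotropic: Σ = (1/d)·I, so tr(Σ) = 1 but tr(Σ^{1/2}) = √d.
iso = np.full(d, 1.0 / d)

# Anisotropic (illustrative choice): eigenvalues λ_i = 1/i², so both traces
# are nearly constant in d (tr(Σ^{1/2}) grows only logarithmically).
aniso = 1.0 / np.arange(1, d + 1) ** 2

print(f"isotropic bound:   {sample_bound(iso, alpha, eps):.1f}")
print(f"anisotropic bound: {sample_bound(aniso, alpha, eps):.1f}")
```

The isotropic term tr(Σ^{1/2})/(αε) = √d/(αε) reproduces the Ω(d^{1/2}) cost mentioned above, while the decaying spectrum keeps the bound essentially independent of d.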
| Original language | English |
|---|---|
| Journal | Advances in Neural Information Processing Systems |
| Volume | 37 |
| State | Published - 2024 |
| Event | 38th Conference on Neural Information Processing Systems, NeurIPS 2024, Vancouver, Canada. Duration: 9 Dec 2024 → 15 Dec 2024 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing