Abstract
We consider the challenge of differentially private PCA. Currently known methods for this task either employ the computationally intensive exponential mechanism or require access to the covariance matrix, and therefore fail to exploit potential sparsity of the data. The design of simpler and more efficient methods for this task was raised as an open problem by Kapralov and Talwar (2013). In this paper we address this problem by employing the output perturbation mechanism. Although it is arguably the simplest and most straightforward technique, it has been overlooked because of the large global sensitivity associated with publishing the leading eigenvector. We tackle this issue by adopting a smooth-sensitivity-based approach, which allows us to establish differential privacy (in a worst-case manner) and near-optimal sample complexity results under an eigengap assumption. We consider both the pure and the approximate notions of differential privacy, and demonstrate a tradeoff between privacy level and sample complexity. We conclude by suggesting how our results can be extended to related problems.
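The output perturbation idea named in the abstract can be illustrated roughly as follows: compute the leading eigenvector non-privately, add random noise, and renormalize. This is a minimal sketch and not the paper's algorithm. The function names and the `noise_scale` parameter are hypothetical, Gaussian noise is chosen only for illustration, and the smooth-sensitivity calibration that the paper uses to set the noise level is not reproduced here, so the sketch on its own carries no privacy guarantee.

```python
import numpy as np


def leading_eigenvector(X):
    """Top right singular vector of X (n samples x d features).

    This equals the leading eigenvector of X^T X, obtained here
    without explicitly forming the d x d covariance matrix.
    """
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[0]


def perturbed_leading_eigenvector(X, noise_scale, rng=None):
    """Output perturbation sketch: noisy, renormalized leading eigenvector.

    noise_scale is a caller-supplied placeholder (an assumption of this
    sketch). A private method would have to derive it from the privacy
    parameters and a sensitivity bound for the data at hand.
    """
    rng = np.random.default_rng() if rng is None else rng
    v = leading_eigenvector(X)
    noisy = v + noise_scale * rng.standard_normal(v.shape)
    return noisy / np.linalg.norm(noisy)
```

With `noise_scale=0.0` the function returns the non-private eigenvector unchanged. Larger values trade accuracy for a stronger perturbation, which loosely mirrors the privacy versus sample-complexity tradeoff the abstract mentions.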
Original language | English |
---|---|
Pages (from-to) | 438-450 |
Number of pages | 13 |
Journal | Proceedings of Machine Learning Research |
Volume | 83 |
State | Published - 2018 |
Externally published | Yes |
Event | 29th International Conference on Algorithmic Learning Theory, ALT 2018, Lanzarote, Spain (7 Apr 2018 → 9 Apr 2018) |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability