Differentially Private Equivalence Testing for Continuous Distributions and Applications

Daniel Omer, Or Sheffet

Research output: Contribution to journalConference articlepeer-review

Abstract

We present the first algorithm for testing equivalence between two continuous distributions using differential privacy (DP). Our algorithm is a private version of the algorithm of Diakonikolas et al [16]. The algorithm of [16] uses the data itself to repeatedly discretize the real line so that - when the two distributions are far apart in Ak-norm - one of the discretized distributions exhibits large L2-norm difference; and upon repeated sampling such large gap would be detected. Designing its private analogue poses two difficulties. First, our DP algorithm can not resample new datapoints as a change to a single datapoint may lead to a very large change in the descretization of the real line. In contrast, the (sorted) index of the discretization point changes only by 1 between neighboring instances, and so we use a novel algorithm that set the discretization points using random Bernoulli noise, resulting in only a few buckets being affected under the right coupling. Second, our algorithm, which doesn't resample data, requires we also revisit the utility analysis of the original algorithm and prove its correctness w.r.t. the original sorted data; a problem we tackle using sampling a subset of Poisson-drawn size from each discretized bin. Lastly, since any distribution can be reduced to a continuous distribution, our algorithm is successfully carried to multiple other families of distributions and thus has numerous applications.

Original languageEnglish
JournalAdvances in Neural Information Processing Systems
Volume37
StatePublished - 2024
Event38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 202415 Dec 2024

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Differentially Private Equivalence Testing for Continuous Distributions and Applications'. Together they form a unique fingerprint.

Cite this