TY - GEN
T1 - The Communication Complexity of Set Intersection Under Product Distributions
AU - Oshman, Rotem
AU - Roth, Tal
N1 - Publisher Copyright: © Rotem Oshman and Tal Roth.
PY - 2023/7
Y1 - 2023/7
AB - We consider a multiparty setting where $k$ parties have private inputs $X_1, \ldots, X_k \subseteq [n]$ and wish to compute the intersection $\bigcap_{\ell=1}^{k} X_\ell$ of their sets, using as little communication as possible. This task generalizes the well-known problem of set disjointness, where the parties are required only to determine whether the intersection is empty or not. In the worst case, it is known that the communication complexity of finding the intersection is the same as that of solving set disjointness, regardless of the size of the intersection: the cost of both problems is $\Omega(n \log k + k)$ bits in the shared blackboard model, and $\Omega(nk)$ bits in the coordinator model. In this work we consider a realistic setting where the parties' inputs are independent of one another, that is, the input is drawn from a product distribution. We show that this makes finding the intersection significantly easier than in the worst case: only $\tilde{\Theta}\big(n^{1-1/k} (H(S) + 1)^{1/k} + k\big)$ bits of communication are required, where $H(S)$ is the Shannon entropy of the intersection $S$. We also show that the parties do not need to know the exact underlying input distribution; if we are given in advance $O(n^{1/k})$ samples from the underlying distribution $\mu$, we can learn enough about $\mu$ to allow us to compute the intersection of an input drawn from $\mu$ using expected communication $\tilde{\Theta}\big(n^{1-1/k} \, \mathbb{E}[|S|]^{1/k} + k\big)$, where $|S|$ is the size of the intersection.
KW - Communication complexity
KW - intersection
KW - set disjointness
UR - http://www.scopus.com/inward/record.url?scp=85167350571&partnerID=8YFLogxK
U2 - https://doi.org/10.4230/LIPIcs.ICALP.2023.95
DO - https://doi.org/10.4230/LIPIcs.ICALP.2023.95
M3 - Conference contribution
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 50th International Colloquium on Automata, Languages, and Programming, ICALP 2023
A2 - Etessami, Kousha
A2 - Feige, Uriel
A2 - Puppis, Gabriele
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 50th International Colloquium on Automata, Languages, and Programming, ICALP 2023
Y2 - 10 July 2023 through 14 July 2023
ER -