Clustering in the Boolean hypercube in a list decoding regime

Irit Dinur, Elazar Goldenberg

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

We consider the following clustering with outliers problem: Given a set of points X ⊂ {-1,1}^n, such that there is some point z ∈ {-1,1}^n for which Pr_{x∈X}[⟨x,z⟩ ≥ ε] ≥ δ, find z. We call such a point z a (δ,ε)-center of X. In this work we give lower and upper bounds for the task of finding a (δ,ε)-center. We first show that for δ = 1 - ν close to 1, i.e. in the "unique decoding regime", given a (1 - ν,ε)-centered set our algorithm can find a (1 - (1 + o(1))ν, (1 - o(1))ε)-center. More interestingly, we study the "list decoding regime", i.e. when δ is close to 0. Our main upper bound shows that for values of ε and δ that are larger than 1/polylog(n), there exists a polynomial time algorithm that finds a (δ - o(1), ε - o(1))-center. Moreover, our algorithm outputs a list of centers explaining all of the clusters in the input. Our main lower bound shows that given a set for which there exists a (δ,ε)-center, it is hard to find even a (δ/n^c, ε)-center for some constant c and ε = 1/poly(n), δ = 1/poly(n).
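
To make the definition concrete, the following is a minimal sketch of a checker for the (δ,ε)-center condition. It is not the authors' algorithm: it only verifies a candidate z, it does not search for one. It assumes that ⟨x,z⟩ denotes the inner product normalized by n, which the abstract does not state explicitly; the names normalized_inner_product and is_center are hypothetical and chosen for illustration.

```python
from typing import Sequence


def normalized_inner_product(x: Sequence[int], z: Sequence[int]) -> float:
    """Return (1/n) * sum_i x_i * z_i for x, z in {-1, 1}^n.

    Assumption: the abstract's <x, z> is normalized by n, so it lies in [-1, 1].
    """
    if len(x) != len(z):
        raise ValueError("x and z must have the same dimension")
    return sum(a * b for a, b in zip(x, z)) / len(x)


def is_center(points: Sequence[Sequence[int]], z: Sequence[int],
              delta: float, eps: float) -> bool:
    """Check whether z is a (delta, eps)-center of the given point set.

    z qualifies when at least a delta fraction of the points x satisfy
    <x, z> >= eps; the remaining points are treated as outliers.
    """
    if not points:
        raise ValueError("the point set must be non-empty")
    agreeing = sum(1 for x in points if normalized_inner_product(x, z) >= eps)
    return agreeing / len(points) >= delta


# Toy instance in {-1, 1}^4: two points correlate with z, two do not.
points = [(1, 1, 1, 1), (1, 1, 1, -1), (-1, -1, -1, -1), (1, -1, 1, -1)]
z = (1, 1, 1, 1)

print(is_center(points, z, delta=0.5, eps=0.5))   # half the points agree
print(is_center(points, z, delta=0.75, eps=0.5))  # stricter delta
```

In this toy instance the normalized inner products with z are 1, 0.5, -1 and 0, so exactly half of the points meet the threshold ε = 0.5. The candidate therefore passes for δ = 0.5 and fails for δ = 0.75.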

Original language: English
Title of host publication: Automata, Languages, and Programming - 40th International Colloquium, ICALP 2013, Proceedings
Pages: 413-424
Number of pages: 12
Edition: PART 1
State: Published - 2013
Event: 40th International Colloquium on Automata, Languages, and Programming, ICALP 2013 - Riga, Latvia
Duration: 8 Jul 2013 → 12 Jul 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7965 LNCS

Conference

Conference: 40th International Colloquium on Automata, Languages, and Programming, ICALP 2013
Country/Territory: Latvia
City: Riga
Period: 8/07/13 → 12/07/13

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
