Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure

Matthew Brennan, Guy Bresler, Wasim Huleihel

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recently, research in unsupervised learning has gravitated towards exploring statistical-computational gaps induced by sparsity. A line of work initiated in Berthet and Rigollet (2013a) has aimed to explain these gaps through reductions to conjecturally hard problems from complexity theory. However, the delicate nature of average-case reductions has limited the development of techniques and often led to weaker hardness results that only apply to algorithms robust to different noise distributions or that do not need to know the parameters of the problem. We introduce several new techniques to give a web of average-case reductions showing strong computational lower bounds based on the planted clique conjecture. Our new lower bounds include:

  • Planted Independent Set: We show tight lower bounds for detecting a planted independent set of size k in a sparse Erdős-Rényi graph of size n with edge density Θ̃(n^(−α)).
  • Planted Dense Subgraph: If p > q are the edge densities inside and outside of the community, we show the first lower bounds for the general regime q = Θ̃(n^(−α)) and p − q = Θ̃(n^(−γ)) where γ ≥ α, matching the lower bounds predicted in Chen and Xu (2016). Our lower bounds apply to a deterministic community size k, resolving a question raised in Hajek et al. (2015).
  • Biclustering: We show strong lower bounds for Gaussian biclustering as a simple hypothesis testing problem to detect a uniformly at random planted flat k × k submatrix.
  • Sparse Rank-1 Submatrix: We show that detection in the sparse spiked Wigner model is often harder than biclustering, and are able to obtain two different tight lower bounds for these problems with different reductions from planted clique.
  • Sparse PCA: We give a reduction between rank-1 submatrix and sparse PCA to obtain tight lower bounds in the less sparse regime k ≫ √n, when the spectral algorithm is optimal over the SDP. We give an alternate reduction recovering the lower bounds of Berthet and Rigollet (2013a) and Gao et al. (2017) in the simple hypothesis testing variant of sparse PCA. We also observe a subtlety in the complexity of sparse PCA that arises when the planted vector is biased.
  • Subgraph Stochastic Block Model: We introduce a model where two small communities are planted in an Erdős-Rényi graph of the same average edge density and give tight lower bounds yielding different hard regimes than planted dense subgraph.

Our results demonstrate that, despite the delicate nature of average-case reductions, using natural problems as intermediates can often be beneficial, as is the case in worst-case complexity. Our main technical contribution is to introduce a set of techniques for average-case reductions that: (1) maintain the level of signal in an instance of a problem; (2) alter its planted structure; and (3) map two initial high-dimensional distributions simultaneously to two target distributions approximately under total variation. We also give algorithms matching our lower bounds and identify the information-theoretic limits of the models we consider.
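As an illustration of the detection problems listed above, the following is a minimal sketch of the two hypotheses in planted dense subgraph detection with a deterministic community size k. It is not taken from the paper: the function names (`sample_null`, `sample_planted`, `community_edge_count`) are hypothetical, and the sampler simply follows the verbal description in the abstract (edge density p inside the community, q elsewhere).

```python
import itertools
import random


def sample_null(n, q, rng):
    """H0: an Erdős-Rényi graph G(n, q), as a set of edges (u, v) with u < v."""
    return {e for e in itertools.combinations(range(n), 2) if rng.random() < q}


def sample_planted(n, k, p, q, rng):
    """H1: plant a community of exactly k vertices chosen uniformly at random.

    Pairs with both endpoints in the community are edges with probability p;
    every other pair is an edge with probability q (assumed p > q).
    """
    community = set(rng.sample(range(n), k))
    edges = set()
    for u, v in itertools.combinations(range(n), 2):
        density = p if (u in community and v in community) else q
        if rng.random() < density:
            edges.add((u, v))
    return edges, community


def community_edge_count(edges, community):
    """Number of edges with both endpoints inside the given vertex set."""
    return sum(1 for u, v in edges if u in community and v in community)
```

A detection algorithm receives a single graph and must decide which of the two samplers produced it. The lower bounds in the abstract concern the regimes of n, k, p and q in which no polynomial-time test is expected to succeed, assuming the planted clique conjecture.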

Original language: American English
Pages (from-to): 48-166
Number of pages: 119
Journal: Proceedings of Machine Learning Research
Volume: 75
State: Published - 1 Jan 2018
Externally published: Yes
Event: 31st Annual Conference on Learning Theory, COLT 2018 - Stockholm, Sweden
Duration: 6 Jul 2018 to 9 Jul 2018

Keywords

  • average-case complexity
  • Planted clique
  • statistical-computational gap

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
