"Convex until proven guilty": Dimension-free acceleration of gradient descent on non-convex functions

Yair Carmon, John C. Duchi, Oliver Hinder, Aaron Sidford

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We develop and analyze a variant of Nesterov's accelerated gradient descent (AGD) for minimization of smooth non-convex functions. We prove that one of two cases occurs: either our AGD variant converges quickly, as if the function was convex, or we produce a certificate that the function is "guilty" of being non-convex. This non-convexity certificate allows us to exploit negative curvature and obtain deterministic, dimension-free acceleration of convergence for non-convex functions. For a function f with Lipschitz continuous gradient and Hessian, we compute a point x with ∥∇f(x)∥ ≤ ϵ in O(ϵ^(-7/4) log(1/ϵ)) gradient and function evaluations. Assuming additionally that the third derivative is Lipschitz, we require only O(ϵ^(-5/3) log(1/ϵ)) evaluations.
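
A minimal, hypothetical sketch of the "convex until proven guilty" idea described above (a simplification, not the authors' exact procedure): run Nesterov's AGD with parameters for an L-smooth, σ-strongly convex function, and after each step test the strong-convexity inequality that a convex analysis would require of the iterates. A violating pair of points is a certificate that the function is "guilty" of non-convexity. The inputs f, grad, L, sigma, and eps are assumed for illustration.

```python
import numpy as np

def agd_until_proven_guilty(f, grad, x0, L, sigma, eps, max_iter=10_000):
    """Run AGD as if f were sigma-strongly convex and L-smooth, checking a
    convexity certificate at every step (illustrative simplification of
    Carmon, Duchi, Hinder & Sidford, ICML 2017).

    Returns (x, None) if ||grad f(x)|| <= eps is reached "as if convex",
    or (x, (u, v)) where the pair (u, v) violates strong convexity and
    hence certifies non-convexity along the direction u - v.
    """
    kappa = L / sigma
    theta = 1.0 / np.sqrt(kappa)  # momentum parameter for strongly convex AGD
    x, y = x0.copy(), x0.copy()
    for _ in range(max_iter):
        g = grad(y)
        if np.linalg.norm(g) <= eps:
            return x, None  # converged quickly, as if the function were convex
        x_next = y - g / L                                          # gradient step
        y_next = x_next + (1 - theta) / (1 + theta) * (x_next - x)  # momentum step
        # sigma-strong convexity would force
        #   f(x) >= f(y) + <grad(y), x - y> + (sigma/2) * ||x - y||^2;
        # a violation is a certificate of non-convexity between x and y.
        diff = x - y
        if f(x) < f(y) + g @ diff + 0.5 * sigma * (diff @ diff):
            return x, (x.copy(), y.copy())  # "guilty": certificate pair
        x, y = x_next, y_next
    return x, None
```

In the paper, such a certificate pair is then used to exploit negative curvature (moving along the direction it exposes) before restarting, which is what yields the accelerated non-convex rates quoted in the abstract; the sketch above shows only the monitoring loop.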

Original language: English
Title of host publication: 34th International Conference on Machine Learning, ICML 2017
Pages: 1069-1091
Number of pages: 23
ISBN (Electronic): 9781510855144
State: Published - 2017
Externally published: Yes
Event: 34th International Conference on Machine Learning, ICML 2017 - Sydney, Australia
Duration: 6 Aug 2017 – 11 Aug 2017

Publication series

Name: 34th International Conference on Machine Learning, ICML 2017
Volume: 2

Conference

Conference: 34th International Conference on Machine Learning, ICML 2017
Country/Territory: Australia
City: Sydney
Period: 6/08/17 – 11/08/17

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software
