Abstract
We prove lower bounds on the complexity of finding ϵ-stationary points (points x such that ‖∇f(x)‖ ≤ ϵ) of smooth, high-dimensional, and potentially non-convex functions f. We consider oracle-based complexity measures, where an algorithm is given access to the value and all derivatives of f at a query point x. We show that for any (potentially randomized) algorithm A, there exists a function f with Lipschitz pth order derivatives such that A requires at least ϵ^{-(p+1)/p} queries to find an ϵ-stationary point. Our lower bounds are sharp to within constants, and they show that gradient descent, cubic-regularized Newton’s method, and generalized pth order regularization are worst-case optimal within their natural function classes.
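As an illustrative reading of the exponent in the abstract, the bound can be instantiated for small p. The symbol T_p(ϵ) for the worst-case number of oracle queries is notation assumed here for illustration and is not taken from the abstract; constants are omitted.

```latex
% T_p(\epsilon): assumed shorthand for the worst-case query count (not from the abstract)
\[
  T_p(\epsilon) \;\gtrsim\; \epsilon^{-(p+1)/p},
  \qquad\text{so}\qquad
  T_1(\epsilon) \gtrsim \epsilon^{-2},
  \quad
  T_2(\epsilon) \gtrsim \epsilon^{-3/2},
  \quad
  T_3(\epsilon) \gtrsim \epsilon^{-4/3}.
\]
```

The cases p = 1 and p = 2 correspond to the function classes for which the abstract names gradient descent and cubic-regularized Newton’s method as worst-case optimal.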
Original language | English |
---|---|
Pages (from-to) | 71-120 |
Number of pages | 50 |
Journal | Mathematical Programming |
Volume | 184 |
Issue number | 1-2 |
State | Published - 1 Nov 2020 |
Externally published | Yes |
Keywords
- Cubic regularization of Newton’s method
- Dimension-free rates
- Gradient descent
- Information-based complexity
- Non-convex optimization
All Science Journal Classification (ASJC) codes
- Software
- General Mathematics