Abstract
Projection-free optimization algorithms, which are mostly based on the classical Frank-Wolfe method, have gained significant interest in the machine learning community in recent years due to their ability to handle convex constraints that are popular in many applications, but for which computing projections is often computationally impractical in high-dimensional settings, and hence prohibit the use of most standard projection-based methods. In particular, a significant research effort was put on projection-free methods for online learning. In this paper we revisit the Online Frank-Wolfe (OFW) method suggested by Hazan and Kale (2012) and fill a gap that has been left unnoticed for several years: OFW achieves a faster rate of O(T2/3) on strongly convex functions (as opposed to the standard O(T3/4) for convex but not strongly convex functions), where T is the sequence length. This is somewhat surprising since it is known that for offline optimization, in general, strong convexity does not lead to faster rates for Frank-Wolfe. We also revisit the bandit setting under strong convexity and prove a similar bound of Õ(T2/3) (instead of O(T3/4) without strong convexity). Hence, in the current state-of-affairs, the best projection-free upper-bounds for the full-information and bandit settings with strongly convex and nonsmooth functions match up to logarithmic factors in T.
Original language | English |
---|---|
Pages (from-to) | 3592-3600 |
Number of pages | 9 |
Journal | Proceedings of Machine Learning Research |
Volume | 130 |
State | Published - 2021 |
Event | 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021 - Virtual, Online, United States Duration: 13 Apr 2021 → 15 Apr 2021 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability