Abstract
In this work we lower bound the individual sequence anytime regret of a large family of online algorithms. This bound depends on the quadratic variation of the sequence, (Formula presented.), and the learning rate. Nevertheless, we show that any learning rate that guarantees a regret upper bound of (Formula presented.) necessarily implies an (Formula presented.) anytime regret on any sequence with quadratic variation (Formula presented.). The algorithms we consider are online linear optimization forecasters whose weight vector at time (Formula presented.) is the gradient of a concave potential function of cumulative losses at time t. We show that these algorithms include all linear Regularized Follow the Leader algorithms. We prove our result for the case of potentials with negative definite Hessians, and for potentials for the best expert setting satisfying some natural regularity conditions. In the best expert setting, we give our result in terms of the translation-invariant relative quadratic variation. We apply our lower bounds to Randomized Weighted Majority and to linear cost Online Gradient Descent. We show that our analysis can be generalized to accommodate diverse measures of variation besides quadratic variation. We apply this generalized analysis to Online Gradient Descent with a regret upper bound that depends on the variance of losses.
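The potential-based view described in the abstract can be illustrated with a minimal sketch, which is not taken from the paper. It assumes the exponential-weights form of Randomized Weighted Majority, whose weights are the gradient of the concave potential Φ(L) = -(1/η) ln((1/N) Σ_i exp(-η L_i)), and it assumes the relative quadratic variation is the sum over rounds of (max_i l_i - min_i l_i)^2. The function names, the learning rate `eta`, and the toy loss sequence are all hypothetical.

```python
import math


def rwm_weights(cum_losses, eta):
    """Gradient of Phi(L) = -(1/eta) * ln((1/N) * sum_i exp(-eta * L_i)).

    The gradient is the usual exponential-weights distribution.
    Subtracting the minimum cumulative loss leaves the result unchanged
    (translation invariance) and avoids numerical underflow.
    """
    base = min(cum_losses)
    exps = [math.exp(-eta * (L - base)) for L in cum_losses]
    total = sum(exps)
    return [e / total for e in exps]


def run_forecaster(losses, eta):
    """Play the forecaster on a list of per-round loss vectors.

    Returns (final regret, anytime regret, relative quadratic variation).
    Assumed definitions: anytime regret is the maximum regret over all
    prefixes; relative quadratic variation sums (max - min)^2 per round.
    """
    cum = [0.0] * len(losses[0])
    alg_loss = 0.0
    anytime_regret = float("-inf")
    rel_qv = 0.0
    for loss in losses:
        # Weights for this round depend only on past cumulative losses.
        p = rwm_weights(cum, eta)
        alg_loss += sum(pi * li for pi, li in zip(p, loss))
        cum = [c + li for c, li in zip(cum, loss)]
        anytime_regret = max(anytime_regret, alg_loss - min(cum))
        rel_qv += (max(loss) - min(loss)) ** 2
    return alg_loss - min(cum), anytime_regret, rel_qv


if __name__ == "__main__":
    # Toy sequence: two experts whose losses alternate.
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    regret, anytime, qv = run_forecaster(toy, eta=1.0)
    print(regret, anytime, qv)
```

On a sequence like this, the anytime regret can exceed the final regret, which is why the two quantities are tracked separately in the sketch.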
Original language | English |
---|---|
Pages (from-to) | 1-26 |
Number of pages | 26 |
Journal | Machine Learning |
Volume | 103 |
Issue number | 1 |
State | Published - 1 Apr 2016 |
Keywords
- Online learning
- Online linear optimization
- Regret lower bounds
- Regret minimization
- Regularized Follow the Leader
All Science Journal Classification (ASJC) codes
- Software
- Artificial Intelligence