Improved Regret Bounds for Projection-free Bandit Convex Optimization

Dan Garber, Ben Kretzu

Research output: Other contribution › peer-review

Abstract

We revisit the challenge of designing online algorithms for the bandit convex optimization problem (BCO) which are also scalable to high dimensional problems. Hence, we consider algorithms that are projection-free, i.e., based on the conditional gradient method whose only access to the feasible decision set is through a linear optimization oracle (as opposed to other methods which require potentially much more computationally-expensive subprocedures, such as computing Euclidean projections). We present the first such algorithm that attains O(T^{3/4}) expected regret using only O(T) overall calls to the linear optimization oracle, in expectation, where T is the number of prediction rounds. This improves over the O(T^{4/5}) expected regret bound recently obtained by Chen et al. (2019), and actually matches the current best regret bound for projection-free online learning in the full information setting.
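
To make the term "projection-free" concrete, the following is a minimal sketch of a classical conditional gradient (Frank-Wolfe) loop in the offline, full-gradient setting. It is not the bandit algorithm of the paper. The feasible set (the probability simplex), the quadratic objective, the step size 2/(t+2), and the names `lmo_simplex` and `frank_wolfe` are all assumptions chosen for illustration. The point is only that each step calls a linear optimization oracle and never computes a Euclidean projection.

```python
def lmo_simplex(gradient):
    """Linear optimization oracle for the probability simplex (illustrative).

    Returns the vertex v minimizing the inner product <gradient, v>,
    i.e. the standard basis vector at the smallest gradient coordinate.
    """
    best = min(range(len(gradient)), key=lambda i: gradient[i])
    return [1.0 if i == best else 0.0 for i in range(len(gradient))]


def frank_wolfe(grad_fn, x0, num_steps):
    """Classical conditional gradient loop: one oracle call per step, no projection."""
    x = list(x0)
    for t in range(num_steps):
        v = lmo_simplex(grad_fn(x))   # single call to the linear oracle
        step = 2.0 / (t + 2.0)        # standard open-loop step size (assumed)
        # A convex combination of feasible points stays feasible.
        x = [(1.0 - step) * xi + step * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical objective: f(x) = ||x - target||^2 with target inside the simplex.
target = [0.2, 0.5, 0.3]


def grad(x):
    return [2.0 * (xi - ti) for xi, ti in zip(x, target)]


x_final = frank_wolfe(grad, [1.0, 0.0, 0.0], 500)
```

Because every iterate is a convex combination of simplex vertices, feasibility is maintained without any projection step; for sets where projection is expensive but linear optimization is cheap, this is the computational advantage the abstract refers to.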
Original language: American English
Number of pages: 11
State: Published - 2020
