Abstract
We study best-of-both-worlds algorithms for bandits with switching cost, recently addressed by Rouyer, Seldin, and Cesa-Bianchi [14]. We introduce a surprisingly simple and effective algorithm that simultaneously achieves a minimax-optimal regret bound (up to logarithmic factors) of $O(T^{2/3})$ in the oblivious adversarial setting and a bound of $O(\min\{\log(T)/\Delta^2, T^{2/3}\})$ in the stochastically constrained regime, both with (unit) switching costs, where $\Delta$ is the gap between the arms. In the stochastically constrained case, our bound improves over the previous result of [14], which achieved regret of $O(T^{1/3}/\Delta)$. We accompany our results with a lower bound showing that, in general, $\tilde{\Omega}(\min\{1/\Delta^2, T^{2/3}\})$ switching cost regret is unavoidable in the stochastically constrained case for algorithms with $O(T^{2/3})$ worst-case switching cost regret.
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022 |
| Editors | S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh |
| Publisher | Neural Information Processing Systems Foundation |
| ISBN (Electronic) | 9781713871088 |
| State | Published - 2022 |
| Event | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States. Duration: 28 Nov 2022 → 9 Dec 2022 |
Publication series
| Name | Advances in Neural Information Processing Systems |
|---|---|
| Volume | 35 |
Conference
| Conference | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 |
|---|---|
| Country/Territory | United States |
| City | New Orleans |
| Period | 28/11/22 → 9/12/22 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing