Abstract
We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.
| Original language | English |
|---|---|
| Pages (from-to) | 1562-1578 |
| Number of pages | 17 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 99 |
| State | Published - 2019 |
| Externally published | Yes |
| Event | 32nd Conference on Learning Theory, COLT 2019 - Phoenix, United States Duration: 25 Jun 2019 → 28 Jun 2019 https://proceedings.mlr.press/v99 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability