Multi-player bandits - A musical chairs approach

Jonathan Rosenski, Ohad Shamir, Liran Szlak

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

We consider a variant of the stochastic multiarmed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward. This setting has been motivated by problems arising in cognitive radio networks, and is especially challenging under the realistic assumption that communication between players is limited. We provide a communication-free algorithm (Musical Chairs) which attains constant regret with high probability, as well as a sublinear-regret, communication-free algorithm (Dynamic Musical Chairs) for the more difficult setting of players dynamically entering and leaving throughout the game. Moreover, both algorithms do not require prior knowledge of the number of players. To the best of our knowledge, these are the first communication-free algorithms with these types of formal guarantees.

Original languageEnglish
Title of host publication33rd International Conference on Machine Learning, ICML 2016
EditorsMaria Florina Balcan, Kilian Q. Weinberger
Pages276-298
Number of pages23
ISBN (Electronic)9781510829008
DOIs
StatePublished - 19 Jun 2016
Event33rd International Conference on Machine Learning, ICML 2016 - New York City, United States
Duration: 19 Jun 201624 Jun 2016

Publication series

Name33rd International Conference on Machine Learning, ICML 2016
Volume1

Conference

Conference33rd International Conference on Machine Learning, ICML 2016
Country/TerritoryUnited States
CityNew York City
Period19/06/1624/06/16

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Multi-player bandits - A musical chairs approach'. Together they form a unique fingerprint.

Cite this