Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme

Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Or Sheffet, Anand D. Sarwate

Research output: Contribution to journal › Article › peer-review

Abstract

We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is δ-PAC, and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem; as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private information, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support, and we characterize that sample complexity. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand.
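
As a rough illustration of the successive elimination idea in the quantile setting, the Python sketch below keeps a set of active arms, pulls each once per round, and drops any arm whose upper confidence bound on the target quantile falls below the best lower bound. It is not the algorithm from the paper: the confidence intervals here are assumed to come from the Dvoretzky-Kiefer-Wolfowitz inequality with a simple union bound over arms and rounds, and the names `empirical_quantile`, `quantile_successive_elimination`, and `max_rounds` are hypothetical. The paper's own gap definition, confidence schedule, and private variant are not reproduced.

```python
import bisect
import math
import random


def empirical_quantile(sorted_xs, p):
    """Order-statistic quantile inf{x : F_n(x) >= p} of a sorted sample.

    Returns -inf when p <= 0 and +inf when p > 1, so that an uninformative
    confidence level yields an uninformative bound.
    """
    if p <= 0:
        return -math.inf
    if p > 1:
        return math.inf
    k = math.ceil(p * len(sorted_xs))  # 1-based index of the order statistic
    return sorted_xs[k - 1]


def quantile_successive_elimination(arms, q, delta, max_rounds=20000):
    """Illustrative elimination loop (assumed DKW-based bounds, not the paper's).

    arms  : list of zero-argument callables, each returning one reward sample
    q     : quantile level in (0, 1)
    delta : target failure probability
    Returns (index of the surviving arm or None, number of rounds used).
    """
    n_arms = len(arms)
    active = list(range(n_arms))
    samples = {i: [] for i in active}  # each list is kept sorted
    for t in range(1, max_rounds + 1):
        for i in active:
            bisect.insort(samples[i], arms[i]())
        # DKW radius with per-(arm, round) failure budget delta / (2 * K * t^2).
        eps = math.sqrt(math.log(4 * n_arms * t * t / delta) / (2 * t))
        lower = {i: empirical_quantile(samples[i], q - eps) for i in active}
        upper = {i: empirical_quantile(samples[i], q + eps) for i in active}
        best_lower = max(lower.values())
        # The arm attaining best_lower always survives, so `active` never empties.
        active = [i for i in active if upper[i] >= best_lower]
        if len(active) == 1:
            return active[0], t
    return None, max_rounds
```

For example, with three Gaussian arms built as `[lambda m=m: random.gauss(m, 1.0) for m in (0.0, 1.0, 3.0)]` and `q=0.5`, the loop compares medians and should return the index of the third arm with high probability. Because no gap or distributional parameter is passed in, the sketch mirrors, in spirit only, the abstract's point that no prior knowledge of the suboptimality gap is needed.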

Original language: English
Article number: 9435774
Pages (from-to): 534-548
Number of pages: 15
Journal: IEEE Journal on Selected Areas in Information Theory
Volume: 2
Issue number: 2
State: Published - Jun 2021

Keywords

  • Quantile bandits
  • best-arm identification
  • differential privacy
  • sequential estimation
  • value at risk

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Media Technology
  • Artificial Intelligence
  • Applied Mathematics
