Abstract
We consider a multi-pool version of the infinitely-many armed bandit problem, where a
learning agent is faced with several large pools of items and is interested in finding the best
item overall. At each time step the agent chooses a pool, and obtains a random item whose
value is precisely revealed. The obtained values within each pool are assumed to be i.i.d.,
with an unknown probability distribution that generally differs among the pools. Under the
PAC framework, we provide lower bounds on the sample complexity of any (ε, δ)-correct
algorithm, and propose an algorithm that attains this bound up to logarithmic factors. We
compare the performance of this multi-pool algorithm to the variant in which the pools are
not distinguishable by the agent and are chosen randomly at each stage. Interestingly, when
the supremal values of the pools happen to be similar, the latter approach may provide
better performance.
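
The sampling model described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper. It only simulates the interaction (choose a pool, receive an i.i.d. item, observe its exact value) and contrasts an agent that can address pools by index with one that draws from a randomly chosen pool at each step. The names `sample_best`, `round_robin`, and `uniform_random`, the two uniform pool distributions, and the budget of 200 draws are all assumptions made for illustration.

```python
import random


def sample_best(pools, budget, choose_pool, rng):
    """Draw `budget` items, one per time step, and return the best value seen."""
    best = float("-inf")
    for t in range(budget):
        k = choose_pool(t, len(pools), rng)  # the agent picks a pool index
        value = pools[k](rng)                # i.i.d. draw; the value is fully revealed
        best = max(best, value)
    return best


def round_robin(t, n_pools, rng):
    """Pools are distinguishable: cycle through them in order."""
    return t % n_pools


def uniform_random(t, n_pools, rng):
    """Pools are not distinguishable: a pool is chosen at random each step."""
    return rng.randrange(n_pools)


# Hypothetical pools (assumption): values uniform on [0, 1] and on [0, 0.8].
pools = [
    lambda rng: rng.uniform(0.0, 1.0),
    lambda rng: rng.uniform(0.0, 0.8),
]

best_random = sample_best(pools, 200, uniform_random, random.Random(0))
best_cycled = sample_best(pools, 200, round_robin, random.Random(0))
print(best_random, best_cycled)
```

Both policies here are naive baselines. A multi-pool algorithm of the kind the abstract describes would presumably adapt its pool choices to the values already observed, which this sketch does not attempt.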
Original language | American English |
---|---|
Title of host publication | The 12th European Workshop on Reinforcement Learning |
State | Published - 2015 |
Event | The 12th European Workshop on Reinforcement Learning - Lille, France; Duration: 10 Jul 2015 → 11 Jul 2015; Conference number: 12; https://ewrl.wordpress.com/past-ewrl/ewrl12-2015/ |
Conference
Conference | The 12th European Workshop on Reinforcement Learning |
---|---|
Abbreviated title | EWRL |
Period | 10/07/15 → 11/07/15 |
Internet address | https://ewrl.wordpress.com/past-ewrl/ewrl12-2015/ |