Stochastic bandits with pathwise constraints

Orly Avner, Shie Mannor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We consider the problem of stochastic bandits, with the goal of maximizing a reward while satisfying pathwise constraints. The motivation for this problem comes from cognitive radio networks, in which agents need to choose between different transmission profiles to maximize throughput under certain operational constraints such as limited average power. Stochastic bandits serve as a natural model for an unknown, stationary environment. We propose an algorithm, based on a steering approach, and analyze its regret with respect to the optimal stationary policy that knows the statistics of the different arms.
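The benchmark named in the abstract, the optimal stationary policy that knows the arm statistics, can be illustrated with a small sketch. It is not the paper's steering algorithm. It assumes a single average-cost constraint (for example, limited average power) and known mean rewards and costs per arm. The abstract does not specify the constraint form, and the names `best_stationary_policy`, `mean_reward`, `mean_cost`, and `budget` are hypothetical. Under these assumptions the benchmark is a linear program over arm-selection probabilities, and with one constraint some optimal solution mixes at most two arms, so the sketch simply enumerates those candidates.

```python
from itertools import combinations


def best_stationary_policy(mean_reward, mean_cost, budget):
    """Best stationary randomized policy under ONE average-cost constraint.

    Assumed formulation (not taken from the paper):
        maximize   sum_i p[i] * mean_reward[i]
        subject to sum_i p[i] * mean_cost[i] <= budget,  p in the simplex.
    Returns (value, probs), or (None, None) if no arm mix meets the budget.
    """
    n = len(mean_reward)
    best_value, best_probs = None, None

    def consider(value, probs):
        nonlocal best_value, best_probs
        if best_value is None or value > best_value:
            best_value, best_probs = value, probs

    # Candidate 1: play a single arm whose mean cost already fits the budget.
    for i in range(n):
        if mean_cost[i] <= budget:
            probs = [0.0] * n
            probs[i] = 1.0
            consider(mean_reward[i], probs)

    # Candidate 2: mix a cheap arm and an expensive arm so that the
    # average cost meets the budget exactly.
    for i, j in combinations(range(n), 2):
        lo, hi = (i, j) if mean_cost[i] <= mean_cost[j] else (j, i)
        if mean_cost[lo] < budget < mean_cost[hi]:
            w = (budget - mean_cost[lo]) / (mean_cost[hi] - mean_cost[lo])
            probs = [0.0] * n
            probs[lo] = 1.0 - w  # weight on the cheap arm
            probs[hi] = w        # weight on the expensive arm
            value = (1.0 - w) * mean_reward[lo] + w * mean_reward[hi]
            consider(value, probs)

    return best_value, best_probs


# Illustrative numbers only: three transmission profiles with made-up statistics.
value, probs = best_stationary_policy(
    mean_reward=[0.2, 0.5, 0.9],
    mean_cost=[0.1, 0.4, 1.0],
    budget=0.5,
)
```

In this toy instance the highest-reward arm violates the budget on its own, so the best stationary policy randomizes between it and a cheaper arm. A learning algorithm would be judged by its regret against this value while keeping its realized average cost within the budget along the sample path.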

Original language: English
Title of host publication: 2011 50th IEEE Conference on Decision and Control and European Control Conference, CDC-ECC 2011
Pages: 3862-3869
Number of pages: 8
State: Published - 2011
Event: 2011 50th IEEE Conference on Decision and Control and European Control Conference, CDC-ECC 2011 - Orlando, FL, United States
Duration: 12 Dec 2011 → 15 Dec 2011

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control

Conference

Conference: 2011 50th IEEE Conference on Decision and Control and European Control Conference, CDC-ECC 2011
Country/Territory: United States
City: Orlando, FL
Period: 12/12/11 → 15/12/11

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Modelling and Simulation
  • Control and Optimization
