Bayesian reinforcement learning

Nikos Vlassis, Mohammad Ghavamzadeh, Shie Mannor, Pascal Poupart

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review


This chapter surveys recent lines of work that use Bayesian techniques for reinforcement learning. In Bayesian learning, uncertainty is expressed by a prior distribution over unknown parameters and learning is achieved by computing a posterior distribution based on the data observed. Hence, Bayesian reinforcement learning distinguishes itself from other forms of reinforcement learning by explicitly maintaining a distribution over various quantities such as the parameters of the model, the value function, the policy or its gradient. This yields several benefits: a) domain knowledge can be naturally encoded in the prior distribution to speed up learning; b) the exploration/exploitation tradeoff can be naturally optimized; and c) notions of risk can be naturally taken into account to obtain robust policies.
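The prior-to-posterior update and the exploration/exploitation tradeoff mentioned above can be made concrete with a small sketch. The example below is an illustration only and is not taken from the chapter: it assumes a two-armed Bernoulli bandit, a Beta prior on each arm's unknown success probability, and Thompson sampling as the action-selection rule. The function names, the arm probabilities, and the seed are all hypothetical choices made for this sketch.

```python
import random


def update_posterior(alpha, beta, reward):
    """Conjugate Beta-Bernoulli update for a single 0/1 reward.

    A Beta(alpha, beta) prior combined with one Bernoulli observation
    gives a Beta(alpha + reward, beta + 1 - reward) posterior.
    """
    return alpha + reward, beta + (1 - reward)


def thompson_sampling(true_probs, steps, prior=(1.0, 1.0), seed=0):
    """Run Thompson sampling on a Bernoulli bandit (illustrative sketch).

    true_probs: hypothetical success probability of each arm, unknown
                to the learner and used only to simulate rewards.
    Returns the final Beta parameters per arm and the pull counts.
    """
    rng = random.Random(seed)
    posteriors = [prior for _ in true_probs]
    pulls = [0] * len(true_probs)
    for _ in range(steps):
        # Draw one plausible success probability per arm from its posterior.
        samples = [rng.betavariate(a, b) for a, b in posteriors]
        # Act greedily with respect to the sampled values.
        arm = max(range(len(samples)), key=samples.__getitem__)
        # Simulate the environment's 0/1 reward for the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Bayesian update of the chosen arm's distribution.
        posteriors[arm] = update_posterior(*posteriors[arm], reward)
        pulls[arm] += 1
    return posteriors, pulls


if __name__ == "__main__":
    final_posteriors, pull_counts = thompson_sampling([0.2, 0.8], steps=1000)
    print("posterior parameters per arm:", final_posteriors)
    print("pulls per arm:", pull_counts)
```

Domain knowledge would enter this sketch through the `prior` argument: a prior concentrated near the true success probabilities should need fewer pulls to identify the better arm than the uniform Beta(1, 1) prior assumed here.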

Original language: English
Title of host publication: Adaptation, Learning, and Optimization
Number of pages: 28
State: Published - 2012

Publication series

Name: Adaptation, Learning, and Optimization


  • Covariance

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

