The advantage of planning with options

Timothy A Mann, Shie Mannor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Temporally extended actions or options have primarily been applied to speed up reinforcement learning by directing exploration to critical regions of the state space. We show that options may play a critical role in planning as well. To demonstrate this, we analyze the convergence rate of Fitted Value Iteration with options. Our analysis reveals that for pessimistic value function estimates, options can improve the convergence rate compared to Fitted Value Iteration with only primitive actions. Furthermore, options can improve convergence even when they are suboptimal. Our experimental results in two different domains demonstrate the key properties from the analysis. While previous research has primarily considered options as a tool for exploration, our theoretical and experimental results demonstrate that options can play an important role in planning.
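The abstract's central claim, that multi-step options can speed up value-iteration-style planning when value estimates are initialized pessimistically, can be illustrated with a toy experiment. The sketch below is an assumption-laden simplification (a tabular chain MDP with a hypothetical "go right k steps" option, not the paper's Fitted Value Iteration setting with function approximation): each sweep of the option backup propagates value `k` states at once, so convergence takes fewer sweeps.

```python
import numpy as np

def vi_sweeps(n_states=10, gamma=0.9, option_len=1, tol=1e-8, max_iters=1000):
    """Count value-iteration sweeps to convergence on a simple chain MDP.

    States 0..n-1; moving right reaches the absorbing goal (state n-1),
    which pays reward 1 on entry and 0 thereafter. With option_len == 1
    only the primitive 'right' action is available; with option_len > 1 a
    temporally extended 'go right option_len steps' option is added.
    (Illustrative toy model only, not the paper's FVI analysis.)
    """
    V = np.zeros(n_states)            # pessimistic initial value estimate
    goal = n_states - 1
    for sweep in range(1, max_iters + 1):
        new_V = np.zeros(n_states)    # goal state keeps value 0 (absorbing)
        for s in range(goal):
            candidates = []
            for k in {1, option_len}:           # primitive action and option
                s2 = min(s + k, goal)
                steps = s2 - s
                # reward 1 arrives on the final step if the goal is reached
                reward = gamma ** (steps - 1) if s2 == goal else 0.0
                candidates.append(reward + gamma ** steps * V[s2])
            new_V[s] = max(candidates)
        if np.max(np.abs(new_V - V)) < tol:
            return sweep
        V = new_V
    return max_iters
```

With primitive actions alone, value propagates one state per sweep; adding a 3-step option lets `vi_sweeps(option_len=3)` converge in roughly a third as many sweeps, mirroring the convergence-rate improvement the analysis predicts for pessimistic initial estimates.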
Original language: English
Title of host publication: The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2013)
Pages: 9-13
Number of pages: 5
State: Published - 2013
Event: The first Multidisciplinary Conference on Reinforcement Learning and Decision Making - New Jersey, United States
Duration: 25 Oct 2013 - 27 Oct 2013
https://rldm.org/past-meetings/rldm2013/

Publication series

Name: RLDM 2013
Publisher: Citeseer

Conference

Conference: The first Multidisciplinary Conference on Reinforcement Learning and Decision Making
Abbreviated title: RLDM2013
Country/Territory: United States
City: New Jersey
Period: 25/10/13 - 27/10/13
Internet address: https://rldm.org/past-meetings/rldm2013/
