Abstract
Temporally extended actions, or options, have primarily been applied to speed up reinforcement learning by directing exploration toward critical regions of the state space. We show that options may play a critical role in planning as well. To demonstrate this, we analyze the convergence rate of Fitted Value Iteration with options. Our analysis reveals that, for pessimistic value function estimates, options can improve the convergence rate compared to Fitted Value Iteration with only primitive actions. Furthermore, options can improve convergence even when they are suboptimal. Experimental results in two different domains confirm the key properties predicted by the analysis. While previous research has primarily treated options as a tool for exploration, our theoretical and experimental results demonstrate that options can play an important role in planning.
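The abstract centers on Fitted Value Iteration (FVI) over an action set augmented with options. As a rough illustration of that setup, here is a minimal sketch on a synthetic tabular MDP; the SMDP-style option models, the random rewards, and all names (`random_model`, `fvi`, the option duration of 3) are illustrative assumptions, not the paper's construction:

```python
import numpy as np

n_states, gamma = 10, 0.95
rng = np.random.default_rng(0)

def random_model(k):
    # Discounted k-step model of a behaviour: rows of P sum to gamma**k,
    # which is how a fixed-duration option enters an SMDP-style backup.
    P = rng.dirichlet(np.ones(n_states), size=n_states) * gamma**k
    r = rng.random(n_states)  # expected cumulative discounted reward (toy values)
    return r, P

primitive = [random_model(1) for _ in range(2)]  # one-step actions
options = [random_model(3) for _ in range(2)]    # e.g. three-step options

def fvi(models, n_iters=200):
    V = np.zeros(n_states)  # pessimistic (all-zero) initial estimate
    for _ in range(n_iters):
        # Bellman-optimality backup over all available behaviours;
        # with a function approximator, a regression fit of V would go here.
        V = np.max([r + P @ V for r, P in models], axis=0)
    return V

V_primitive = fvi(primitive)
V_both = fvi(primitive + options)
```

Since each application of an option's model discounts by gamma**k rather than gamma, backups that include options can contract toward the fixed point in fewer iterations, which is the intuition behind the improved convergence rate for pessimistic initial estimates.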
Original language | English |
---|---|
Title of host publication | The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2013) |
Pages | 9-13 |
Number of pages | 5 |
State | Published - 2013 |
Event | The first Multidisciplinary Conference on Reinforcement Learning and Decision Making, New Jersey, United States. Duration: 25 Oct 2013 → 27 Oct 2013. https://rldm.org/past-meetings/rldm2013/ |
Publication series
Name | RLDM 2013 |
---|---|
Publisher | Citeseer |
Conference
Conference | The first Multidisciplinary Conference on Reinforcement Learning and Decision Making |
---|---|
Abbreviated title | RLDM2013 |
Country/Territory | United States |
City | New Jersey |
Period | 25/10/13 → 27/10/13 |
Internet address | https://rldm.org/past-meetings/rldm2013/ |