Abstract
Recently, partially observable Markov decision process (POMDP) solvers have shown the ability to scale up significantly by exploiting domain structure, such as factored representations. In many domains, the agent is required to complete a set of independent tasks. We propose to decompose a factored POMDP into a set of restricted POMDPs, each defined over a subset of task-relevant state variables. We solve each restricted model independently, acquiring a value function. The value functions of the restricted POMDPs are then combined to form a policy for the complete POMDP. We explain the process of identifying variables that correspond to tasks, and how to create a model restricted to a single task or to a subset of tasks. We demonstrate our approach on a number of benchmarks from the factored POMDP literature, showing that our methods are applicable to models with more than 100 state variables.
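The abstract describes a three-step pipeline: restrict the factored POMDP to each task's relevant variables, solve each restricted model independently, and combine the resulting value functions into a policy for the full model. The minimal, runnable Python sketch below illustrates that solve-then-combine structure on a toy two-task domain; every name in it, the trivial one-step "solver", and the max-over-tasks combination rule are illustrative assumptions, not the paper's actual models or algorithms.

```python
from typing import Dict, List, Tuple

import numpy as np

# A belief over the full factored state, stored as one marginal per task
# variable (the toy tasks below are independent, so this is exact).
FullBelief = Dict[str, np.ndarray]
# An alpha-vector: the action it recommends and its value per state.
AlphaVector = Tuple[str, np.ndarray]


def solve_restricted(task: str, reward: np.ndarray) -> List[AlphaVector]:
    """Stand-in for a point-based solver run on one task's restricted POMDP.

    Here the "model" is a trivial one-step problem, so the value function is
    just one alpha-vector per action; a real point-based solver would iterate
    over a set of sampled belief points.
    """
    return [
        (f"work-on-{task}", reward),                 # pursue the task
        (f"ignore-{task}", np.zeros_like(reward)),   # do nothing
    ]


def restricted_value(alphas: List[AlphaVector], b: np.ndarray) -> Tuple[float, str]:
    """Value of a restricted belief b: max over the task's alpha-vectors."""
    action, vec = max(alphas, key=lambda av: float(av[1] @ b))
    return float(vec @ b), action


def combined_policy(value_fns: Dict[str, List[AlphaVector]],
                    belief: FullBelief) -> str:
    """Combine the per-task value functions into one decision.

    The rule used here (project the belief onto each task's variables and
    execute the best action of the highest-valued task) is an assumption for
    illustration, not necessarily the paper's combination method.
    """
    scored = {t: restricted_value(alphas, belief[t])
              for t, alphas in value_fns.items()}
    best_task = max(scored, key=lambda t: scored[t][0])
    return scored[best_task][1]


if __name__ == "__main__":
    # Two toy tasks, each over one binary "done" variable; reward[1] is the
    # payoff for completing the task.
    rewards = {"A": np.array([0.0, 1.0]), "B": np.array([0.0, 5.0])}
    value_fns = {t: solve_restricted(t, r) for t, r in rewards.items()}

    belief = {"A": np.array([0.2, 0.8]), "B": np.array([0.9, 0.1])}
    print(combined_policy(value_fns, belief))  # -> work-on-A (0.8 > 0.5)
```

In the paper itself, the restricted models are derived from the factored representation and solved with point-based algorithms (see the keywords below); the toy above only mirrors the overall decompose-solve-combine structure.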
| Original language | American English |
| --- | --- |
| Article number | 6494590 |
| Pages (from-to) | 208-216 |
| Number of pages | 9 |
| Journal | IEEE Transactions on Cybernetics |
| Volume | 44 |
| Issue number | 2 |
| DOIs | |
| State | Published - 1 Feb 2014 |
Keywords
- Factored POMDP
- partially observable Markov decision processes (POMDPs)
- point-based algorithms
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Information Systems
- Human-Computer Interaction
- Computer Science Applications
- Electrical and Electronic Engineering