Value iteration networks

Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network, and trained end-to-end using standard backpropagation. We evaluate VIN based policies on discrete and continuous path-planning domains, and on a natural-language based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains. This paper is a significantly abridged and IJCAI audience targeted version of the original NIPS 2016 paper with the same title, available here: https://arxiv.org/abs/1602.02867.

Original languageEnglish
Title of host publication26th International Joint Conference on Artificial Intelligence, IJCAI 2017
EditorsCarles Sierra
Pages4949-4953
Number of pages5
ISBN (Electronic)9780999241103
DOIs
StatePublished - 2017
Externally publishedYes
Event26th International Joint Conference on Artificial Intelligence, IJCAI 2017 - Melbourne, Australia
Duration: 19 Aug 201725 Aug 2017

Publication series

NameIJCAI International Joint Conference on Artificial Intelligence
Volume0

Conference

Conference26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Country/TerritoryAustralia
CityMelbourne
Period19/08/1725/08/17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Value iteration networks'. Together they form a unique fingerprint.

Cite this