Action Redundancy in Reinforcement Learning

Nir Baram, Guy Tennenholtz, Shie Mannor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm which seeks to maximize return under entropy regularization. However, action entropy does not necessarily coincide with state entropy, e.g., when multiple actions produce the same transition. Instead, we propose to maximize the transition entropy, i.e., the entropy of next states. We show that transition entropy can be described by two terms, namely, model-dependent transition entropy and action redundancy. In particular, we explore the latter in both deterministic and stochastic settings and develop tractable approximation methods in a near model-free setup. We construct algorithms to minimize action redundancy and demonstrate their effectiveness on a synthetic environment with multiple redundant actions as well as contemporary benchmarks in Atari and MuJoCo. Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
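
The gap between action entropy and next-state entropy described in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm. It assumes a hypothetical single state with deterministic transitions in which three of four actions lead to the same next state. The action names, state names, and both policies are invented for illustration.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def next_state_distribution(policy, transition):
    """Push a policy over actions through a deterministic action -> next-state map."""
    dist = defaultdict(float)
    for action, prob in policy.items():
        dist[transition[action]] += prob
    return dict(dist)


# Hypothetical setup (an assumption, not from the paper):
# three redundant actions all lead to "s_left", one action leads to "s_right".
transition = {
    "left_a": "s_left",
    "left_b": "s_left",
    "left_c": "s_left",
    "right": "s_right",
}

# Uniform over actions: the policy that maximizes action entropy.
uniform_policy = {action: 0.25 for action in transition}

# Uniform over next states: mass is split evenly between the two outcomes.
balanced_policy = {"left_a": 1 / 6, "left_b": 1 / 6, "left_c": 1 / 6, "right": 0.5}

for name, policy in [("uniform", uniform_policy), ("balanced", balanced_policy)]:
    h_action = entropy(policy.values())
    h_next = entropy(next_state_distribution(policy, transition).values())
    print(f"{name}: action entropy = {h_action:.4f}, next-state entropy = {h_next:.4f}")
```

In this toy setup the uniform policy has the higher action entropy but the lower next-state entropy, because three quarters of its probability mass collapses onto the same transition. The balanced policy gives up some action entropy and reaches the maximum next-state entropy for two outcomes.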

Original language: English
Title of host publication: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021
Pages: 376-385
Number of pages: 10
State: Published - 2021
Event: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021 - Virtual, Online
Duration: 27 Jul 2021 → 30 Jul 2021

Conference

Conference: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021
City: Virtual, Online
Period: 27/07/21 → 30/07/21

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Applied Mathematics
