Action Redundancy in Reinforcement Learning

Nir Baram, Guy Tennenholtz, Shie Mannor

Research output: Contribution to journal › Conference article › peer-review

Abstract

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm that seeks to maximize return under entropy regularization. However, action entropy does not necessarily coincide with state entropy, e.g., when multiple actions produce the same transition. Instead, we propose to maximize the transition entropy, i.e., the entropy of next states. We show that transition entropy can be decomposed into two terms: model-dependent transition entropy and action redundancy. In particular, we study the latter in both deterministic and stochastic settings and develop tractable approximation methods in a near model-free setup. We construct algorithms to minimize action redundancy and demonstrate their effectiveness on a synthetic environment with multiple redundant actions as well as on contemporary benchmarks in Atari and MuJoCo. Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
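A minimal sketch of the decomposition above, using the standard chain rule for entropy and mutual information; the notation (π for the policy, P for the transition kernel) and the identification of H(A | S', s) with action redundancy are illustrative assumptions, not the paper's exact definitions:

% Sketch: next-state entropy at a fixed state s splits via the chain rule
% into a model-dependent term and an action-information term. Identifying
% H(A | S', s) with "action redundancy" is our assumed reading.
\begin{align*}
\mathcal{H}(S' \mid s)
  &= \mathcal{H}(S' \mid s, A) + I(A; S' \mid s) \\
  &= \underbrace{\sum_{a} \pi(a \mid s)\, \mathcal{H}\big(P(\cdot \mid s, a)\big)}_{\text{model-dependent transition entropy}}
   + \underbrace{\mathcal{H}(A \mid s) - \mathcal{H}(A \mid S', s)}_{\text{action entropy minus action redundancy}}
\end{align*}

Under this reading, when transitions are deterministic the model-dependent term vanishes, so maximizing transition entropy amounts to maximizing action entropy while minimizing the redundancy term H(A | S', s), i.e., the entropy of actions that lead to the same next state.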

Original language: English
Pages (from-to): 376-385
Number of pages: 10
Journal: Proceedings of Machine Learning Research
Volume: 161
State: Published - 2021
Event: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021 - Virtual, Online
Duration: 27 Jul 2021 - 30 Jul 2021

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Statistics and Probability
