Criticality-Based Advice in Reinforcement Learning

Yitzhak Spielberg, Amos Azaria

Research output: Contribution to conference, Paper, peer-review

Abstract

One way to make reinforcement learning (RL) more efficient is to utilize human advice. Because human advice is expensive, the central question in advice-based RL is how to decide in which states the agent should ask for advice. Various advice strategies have been proposed to address this challenge. Although all of these strategies distribute advice more efficiently than naive strategies (such as choosing random states), they rely solely on the agent's internal representation of the task (the action-value function, the policy, etc.) and are therefore rather inefficient when this representation is inaccurate, particularly in the early stages of the learning process. To address this weakness, we propose an approach to advice-based RL in which the human's role is not limited to giving advice in chosen states but also includes hinting a priori (before the learning procedure) which sub-domains of the state space require more advice. Specifically, we suggest different ways to improve any given advice strategy by utilizing the concept of critical states: states in which it is very important to choose the correct action. Finally, we present experiments in two environments that validate the efficiency of our approach.
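
The idea of augmenting an existing advice strategy with criticality can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how the two signals are combined, so the linear blend, the function names (`q_spread`, `criticality_weighted_score`, `should_ask`), the `weight` and `threshold` parameters, and the assumption that a human supplies a criticality value in [0, 1] per state are all hypothetical. The base score used here, the gap between the best and worst action-value estimates, stands in for "any given advice strategy" that depends only on the agent's internal representation.

```python
from typing import Sequence


def q_spread(q_values: Sequence[float]) -> float:
    """Base score from the agent's own estimates (assumed heuristic):
    the gap between the best and worst action values in a state."""
    return max(q_values) - min(q_values)


def criticality_weighted_score(
    base_score: float, criticality: float, weight: float = 0.5
) -> float:
    """Blend the base score with a human-provided criticality hint.

    `criticality` is assumed to lie in [0, 1] and to be given before
    learning starts; `weight` controls how much the hint counts.
    The linear blend is an illustrative choice, not taken from the paper.
    """
    if not 0.0 <= criticality <= 1.0:
        raise ValueError("criticality must be in [0, 1]")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return (1.0 - weight) * base_score + weight * criticality


def should_ask(
    q_values: Sequence[float],
    criticality: float,
    budget_left: int,
    threshold: float = 0.5,
    weight: float = 0.5,
) -> bool:
    """Ask for advice only if budget remains and the blended score
    reaches the (hypothetical) threshold."""
    if budget_left <= 0:
        return False
    score = criticality_weighted_score(q_spread(q_values), criticality, weight)
    return score >= threshold


# Early in learning all action values are still zero, so the base score
# alone cannot distinguish states; the criticality hint can.
untrained_q = [0.0, 0.0, 0.0]
ask_in_critical_state = should_ask(untrained_q, criticality=1.0, budget_left=10)
ask_in_ordinary_state = should_ask(untrained_q, criticality=0.0, budget_left=10)
```

With untrained action values the base score is zero everywhere, so in this sketch the decision to ask is driven entirely by the human hint, which is the early-learning situation the abstract identifies as the weak point of representation-only strategies.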

Original language: English
Pages: 1925-1931
Number of pages: 7
State: Published - 2022
Event: 44th Annual Meeting of the Cognitive Science Society: Cognitive Diversity, CogSci 2022 - Toronto, Canada
Duration: 27 Jul 2022 to 30 Jul 2022

Conference

Conference: 44th Annual Meeting of the Cognitive Science Society: Cognitive Diversity, CogSci 2022
Country/Territory: Canada
City: Toronto
Period: 27/07/22 to 30/07/22

Keywords

  • interactive machine learning
  • reinforcement learning

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Human-Computer Interaction
  • Cognitive Neuroscience
