TY - JOUR
T1 - Reinforcement learning and human behavior
AU - Shteingart, Hanan
AU - Loewenstein, Yonatan
N1 - Funding Information: This work was supported by the Israel Science Foundation (Grant No. 868/08), a grant from the Ministry of Science and Technology, Israel, the Ministry of Foreign and European Affairs and the Ministry of Higher Education and Research, France, and the Gatsby Charitable Foundation.
PY - 2014/4
Y1 - 2014/4
N2 - The dominant computational approach to model operant learning and its underlying neural activity is model-free reinforcement learning (RL). However, there is accumulating behavioral and neural evidence that human (and animal) operant learning is far more multifaceted. Theoretical advances in RL, such as hierarchical and model-based RL, extend the explanatory power of RL to account for some of these findings. Nevertheless, some other aspects of human behavior remain inexplicable even in the simplest tasks. Here we review developments and remaining challenges in relating RL models to human operant learning. In particular, we emphasize that learning a model of the world in terms of state-action pairs is an essential step before or in parallel to learning the policy in RL, and we discuss alternative models that directly learn a policy without an explicit world model.
AB - The dominant computational approach to model operant learning and its underlying neural activity is model-free reinforcement learning (RL). However, there is accumulating behavioral and neural evidence that human (and animal) operant learning is far more multifaceted. Theoretical advances in RL, such as hierarchical and model-based RL, extend the explanatory power of RL to account for some of these findings. Nevertheless, some other aspects of human behavior remain inexplicable even in the simplest tasks. Here we review developments and remaining challenges in relating RL models to human operant learning. In particular, we emphasize that learning a model of the world in terms of state-action pairs is an essential step before or in parallel to learning the policy in RL, and we discuss alternative models that directly learn a policy without an explicit world model.
UR - http://www.scopus.com/inward/record.url?scp=84891466050&partnerID=8YFLogxK
U2 - 10.1016/j.conb.2013.12.004
DO - 10.1016/j.conb.2013.12.004
M3 - Review article
C2 - 24709606
SN - 0959-4388
VL - 25
SP - 93
EP - 98
JO - Current Opinion in Neurobiology
JF - Current Opinion in Neurobiology
ER -