TY - JOUR
T1 - PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning
AU - Livne, Dor
AU - Cohen, Kobi
N1 - Funding Information: This work was supported in part by the U.S.-Israel Binational Science Foundation (BSF) under Grant 2017723 and in part by the Cyber Security Research Center at Ben-Gurion University of the Negev under Grant 076/16. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Diana Marculescu. (Corresponding author: Kobi Cohen.) Manuscript received May 31, 2019; revised November 18, 2019; accepted January 10, 2020. Date of publication January 17, 2020; date of current version August 10, 2020. The authors are with the School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel (e-mail: dorliv@bgu.ac.il; yakovsec@bgu.ac.il). Digital Object Identifier 10.1109/JSTSP.2020.2967566 Publisher Copyright: © 2007-2012 IEEE.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - The recent success of deep neural networks (DNNs) for function approximation in reinforcement learning has triggered the development of Deep Reinforcement Learning (DRL) algorithms in various fields, such as robotics, computer games, natural language processing, computer vision, sensing systems, and wireless networking. Unfortunately, DNNs suffer from high computational cost and memory consumption, which limits the use of DRL algorithms in systems with limited hardware resources. In recent years, pruning algorithms have demonstrated considerable success in reducing the redundancy of DNNs in classification tasks. However, existing algorithms suffer from a significant performance reduction in the DRL domain. In this article, we develop the first effective solution to the performance reduction problem of pruning in the DRL domain, and establish a working algorithm, named Policy Pruning and Shrinking (PoPS), to train DRL models with strong performance while achieving a compact representation of the DNN. The framework is based on a novel iterative policy pruning and shrinking method that leverages the power of transfer learning when training the DRL model. We present an extensive experimental study that demonstrates the strong performance of PoPS using the popular Cartpole, Lunar Lander, Pong, and Pacman environments. Finally, we develop an open source software for the benefit of researchers and developers in related fields.
KW - Deep reinforcement learning (DRL)
KW - deep neural network (DNN)
KW - pruning algorithms
UR - http://www.scopus.com/inward/record.url?scp=85080125012&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/JSTSP.2020.2967566
DO - https://doi.org/10.1109/JSTSP.2020.2967566
M3 - Article
SN - 1932-4553
VL - 14
SP - 789
EP - 801
JO - IEEE Journal on Selected Topics in Signal Processing
JF - IEEE Journal on Selected Topics in Signal Processing
IS - 4
M1 - 8962235
ER -