A Product Engine for Energy-Efficient Execution of Binary Neural Networks Using Resistive Memories

Joao Vieira, Edouard Giacomin, Yasir Qureshi, Marina Zapater, Xifan Tang, Shahar Kvatinsky, David Atienza, Pierre-Emmanuel Gaillardon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The need for running complex Machine Learning (ML) algorithms, such as Convolutional Neural Networks (CNNs), on edge devices, which are highly constrained in terms of computing power and energy, makes it important to execute such applications efficiently. This situation has led to the popularization of Binary Neural Networks (BNNs), which significantly reduce execution time and memory requirements by representing the weights (and possibly the data being operated on) using only one bit. Because approximately 90% of the operations executed by CNNs and BNNs are convolutions, a significant part of the memory transfers consists of fetching the convolutional kernels. Such kernels are usually small (e.g., 3×3 operands), and, particularly in BNNs, redundancy is expected. Therefore, equal kernels can be mapped to the same memory addresses, requiring significantly less memory to store them. In this context, this paper presents a custom Binary Dot Product Engine (BDPE) for BNNs that exploits the features of Resistive Random-Access Memories (RRAMs). This new engine accelerates the execution of the inference phase of BNNs. The novel BDPE locally stores the most frequently used binary weights and performs binary convolution using the computing capabilities enabled by the RRAMs. The system-level gem5 architectural simulator was used together with a C-based ML framework to evaluate the system's performance and obtain power results. Results show that this novel BDPE improves performance by 11.3%, improves energy efficiency by 7.4%, and reduces the number of memory accesses by 10.7%, at a cost of less than 0.3% additional die area, when integrated with a 28 nm Fully Depleted Silicon On Insulator ARMv8 in-order core, in comparison to a fully-optimized baseline of YoloV3 XNOR-Net running on an unmodified Central Processing Unit.
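For illustration, the binary convolution that such an engine offloads reduces, in software, to an XNOR followed by a population count over bit-packed weights and activations. The C sketch below shows this primitive under assumed conventions (+1 encoded as bit 1, -1 as bit 0, 32-bit packing); the function names and packing scheme are illustrative assumptions, not the paper's implementation.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch (assumptions, not the paper's API): with weights and
 * activations packed one bit per element (+1 -> 1, -1 -> 0), a
 * 32-element binary dot product reduces to one XNOR and one popcount. */
static inline int binary_dot_product(uint32_t weights, uint32_t activations)
{
    uint32_t xnor = ~(weights ^ activations);   /* 1 where signs match */
    int matches = __builtin_popcount(xnor);     /* count matching bits */
    return 2 * matches - 32;                    /* matches - mismatches */
}

int main(void)
{
    uint32_t w = 0xF0F0F0F0u;  /* example packed kernel bits       */
    uint32_t a = 0xFF00FF00u;  /* example packed input activations */
    printf("dot = %d\n", binary_dot_product(w, a));
    return 0;
}
```

In the BDPE, this XNOR-popcount step is performed by the RRAM array itself rather than by CPU instructions, which is where the performance and energy gains reported in the abstract come from.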

Original language: English
Title of host publication: VLSI-SoC 2019 - 27th IFIP/IEEE International Conference on Very Large Scale Integration, Proceedings
Editors: Carolina Metzler, Giovanni De Micheli, Pierre-Emmanuel Gaillardon, Carlos Silva-Cardenas, Ricardo Reis
Pages: 160-165
Number of pages: 6
ISBN (Electronic): 9781728139159
DOIs
State: Published - Oct 2019
Event: 27th IFIP/IEEE International Conference on Very Large Scale Integration, VLSI-SoC 2019 - Cuzco, Peru
Duration: 6 Oct 2019 - 9 Oct 2019

Publication series

Name: IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC
Volume: 2019-October

Conference

Conference: 27th IFIP/IEEE International Conference on Very Large Scale Integration, VLSI-SoC 2019
Country/Territory: Peru
City: Cuzco
Period: 6/10/19 - 9/10/19

Keywords

  • Binary Neural Networks
  • Edge Devices
  • Machine Learning
  • RRAM-based Binary Dot Product Engine

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Software
  • Electrical and Electronic Engineering
