Accelerating Inference on Binary Neural Networks with Digital RRAM Processing

João Vieira, Edouard Giacomin, Yasir Qureshi, Marina Zapater, Xifan Tang, Shahar Kvatinsky, David Atienza, Pierre-Emmanuel Gaillardon

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

The need for efficient Convolutional Neural Networks (CNNs) targeting embedded systems led to the popularization of Binary Neural Networks (BNNs), which significantly reduce execution time and memory requirements by representing operands using only one bit. Moreover, since about 90% of the operations executed by CNNs and BNNs are convolutions, a quest for custom accelerators that optimize the convolution operation and reduce data movements has started, in which Resistive Random Access Memory (RRAM)-based accelerators have proven to be of interest. This work presents a custom Binary Dot Product Engine (BDPE) for BNNs that exploits the low-level compute capabilities enabled by RRAMs. This new engine accelerates the inference phase of BNNs by locally storing the most frequently used kernels and performing the binary convolutions using RRAM devices and optimized custom circuitry. Results show that the novel BDPE improves performance by 11.3% and energy efficiency by 7.4%, and reduces the number of memory accesses by 10.7%, at a cost of less than 0.3% additional die area.
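The binary convolutions the abstract describes reduce, at the core, to binary dot products: with {-1, +1} operands packed as single bits, a dot product becomes an XNOR followed by a popcount, which is what makes RRAM-based digital processing attractive. The following is a minimal software sketch of that reduction; the function name and encoding are illustrative assumptions, not taken from the chapter itself.

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed as n-bit ints.

    Bit 1 encodes +1 and bit 0 encodes -1, so elements agree exactly
    where XNOR yields 1; the dot product is agreements minus disagreements.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask    # 1 where the two elements agree
    matches = bin(xnor).count("1")      # popcount of the agreement mask
    return 2 * matches - n              # (+1 per match) + (-1 per mismatch)

# a = [+1, -1, +1, -1] -> 0b1010,  b = [+1, +1, -1, -1] -> 0b1100
print(binary_dot(0b1010, 0b1100, 4))  # -> 0 (two matches, two mismatches)
```

A hardware BDPE performs the XNOR and popcount steps with RRAM devices and custom circuitry rather than ALU instructions, but the arithmetic identity is the same.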

Original language: English
Title of host publication: VLSI-SoC
Subtitle of host publication: New Technology Enabler - 27th IFIP/IEEE WG 10.5 International Conference on Very Large Scale Integration, VLSI-SoC 2019, Revised and Extended Selected Papers
Editors: Carolina Metzler, Ricardo Reis, Pierre-Emmanuel Gaillardon, Giovanni De Micheli, Carlos Silva-Cardenas
Chapter: 12
Pages: 257-278
Number of pages: 22
DOIs
State: Published - 2020
Event: 27th IFIP/IEEE WG 10.5 International Conference on Very Large Scale Integration, VLSI-SoC 2019 - Cusco, Peru
Duration: 6 Oct 2019 - 9 Oct 2019

Publication series

Name: IFIP Advances in Information and Communication Technology
Volume: 586 IFIP

Conference

Conference: 27th IFIP/IEEE WG 10.5 International Conference on Very Large Scale Integration, VLSI-SoC 2019
Country/Territory: Peru
City: Cusco
Period: 6/10/19 - 9/10/19

Keywords

  • Binary Neural Networks
  • Embedded systems
  • Machine Learning
  • RRAM-based Binary Dot Product Engine

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
