Expectation backpropagation: Parameter-free training of multilayer neural networks with continuous or discrete weights

Daniel Soudry, Itay Hubara, Ron Meir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multilayer Neural Networks (MNNs) are commonly trained using gradient-descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP-based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a "mean-field" factorized distribution, in an online setting. Using online EP and the central limit theorem, we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. (2) The weights can be restricted to have discrete values. This is especially useful for implementing trained MNNs in precision-limited hardware chips, thus improving their speed and energy efficiency by several orders of magnitude. We test the EBP algorithm numerically on eight binary text classification tasks. On all tasks, EBP outperforms both (1) standard BP with the optimal constant learning rate and (2) the previously reported state of the art. Interestingly, EBP-trained MNNs with binary weights usually perform better than MNNs with continuous (real) weights, provided we average the MNN output over the inferred posterior.
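The abstract describes the algorithm only at a high level. As a concrete illustration of its central idea, the probabilistic forward pass that the central-limit-theorem approximation enables, here is a minimal NumPy sketch. It is an assumption-laden reconstruction, not the authors' reference implementation: the function names (`mean_field_forward`, `sign_moments`), the choice of sign activations, and the tanh parameterization of binary weights are illustrative, and the analytic Bayes update of the posterior (the "backward" half of EBP) is omitted.

```python
import numpy as np
from scipy.special import erf

def sign_moments(mu, var):
    """Mean and variance of sign(u) for u ~ N(mu, var).

    P(u > 0) = Phi(mu / sigma), so E[sign(u)] = 2*Phi(mu/sigma) - 1
    = erf(mu / sqrt(2*var)); since sign(u) is +/-1, Var = 1 - mean**2.
    """
    m = erf(mu / np.sqrt(2.0 * var + 1e-12))
    return m, 1.0 - m ** 2

def mean_field_forward(x, layers):
    """Propagate activation means/variances through sign-unit layers.

    `layers` is a list of (W_mean, W_var) pairs: under a factorized
    ("mean-field") posterior, each weight is summarized by its mean and
    variance. Returns the mean/variance of the network output under
    that posterior, i.e. the posterior-averaged prediction.
    """
    h_mean, h_var = x, np.zeros_like(x)
    for W_mean, W_var in layers:
        # CLT step: each pre-activation is a sum of many (approximately
        # independent) terms, so treat it as Gaussian with matched moments.
        u_mean = W_mean @ h_mean
        u_var = W_var @ (h_mean ** 2 + h_var) + (W_mean ** 2) @ h_var
        h_mean, h_var = sign_moments(u_mean, u_var)
    return h_mean, h_var

# Usage with binary weights w in {-1, +1}: if theta holds the posterior
# log-odds of each weight, then E[w] = tanh(theta) and Var[w] = 1 - E[w]**2.
rng = np.random.default_rng(0)
theta = [rng.normal(size=(16, 8)), rng.normal(size=(1, 16))]
layers = [(np.tanh(t), 1.0 - np.tanh(t) ** 2) for t in theta]
out_mean, out_var = mean_field_forward(rng.normal(size=8), layers)
# sign(out_mean) gives the posterior-averaged classification; a single
# binary network is recovered by rounding each weight to sign(E[w]).
```

Averaging the output over the weight posterior in this way, rather than committing to one rounded network, is what the abstract refers to when it notes that binary-weight MNNs can outperform real-weight ones.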

Original language: English
Title of host publication: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014
Pages: 963-971
Number of pages: 9
State: Published - 2014
Event: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada
Duration: 8 Dec 2014 - 13 Dec 2014

Conference

Conference: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014
Country/Territory: Canada
City: Montreal
Period: 8/12/14 - 13/12/14

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
