Abstract
Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a "mean-field" factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. (2) The weights can be restricted to have discrete values. This is especially useful for implementing trained MNNs in precision limited hardware chips, thus improving their speed and energy efficiency by several orders of magnitude. We test the EBP algorithm numerically in eight binary text classification tasks. In all tasks, EBP outperforms: (1) standard BP with the optimal constant learning rate (2) previously reported state of the art. Interestingly, EBP-trained MNNs with binary weights usually perform better than MNNs with continuous (real) weights - if we average the MNN output using the inferred posterior.
Original language | English |
---|---|
Title of host publication | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 |
Pages | 963-971 |
Number of pages | 9 |
State | Published - 2014 |
Event | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada Duration: 8 Dec 2014 → 13 Dec 2014 |
Conference
Conference | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 |
---|---|
Country/Territory | Canada |
City | Montreal |
Period | 8/12/14 → 13/12/14 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing