Abstract
Batch normalization (BN) is a key component of most neural network architectures. A major weakness of BN is its critical dependence on a reasonably large batch size, because the mean and variance are only approximated from a single batch of data. Another weakness is the difficulty of applying BN in autoregressive or structured models. In this study we show that, for any network node obtained as a linear function of the input features, it is feasible to calculate the mean and variance over the entire training dataset rather than over a single batch, as standard BN does. We dub this method Full Batch Normalization (FBN). Our main focus is on a factorized autoregressive CRF model, where we show that FBN is applicable and allows BN to be integrated into the linear-chain CRF likelihood. The improved performance of FBN is illustrated on the large SKU dataset, which contains images of retail store product displays.
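The claim about linear nodes rests on a standard identity: for a node z = w·x + b, the dataset-wide mean is w·μ + b and the dataset-wide variance is wᵀΣw, where μ and Σ are the mean vector and covariance matrix of the input features. The following is a minimal sketch of that identity only, not the authors' implementation. The function names (`feature_moments`, `node_moments`, `normalize`), the synthetic data, the weights, and the `eps` value are all assumptions made for illustration, and the CRF part of the method is not covered.

```python
import random


def feature_moments(X):
    """Mean vector and population covariance matrix of the input features,
    computed once over the whole dataset (hypothetical helper)."""
    n, d = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(d)]
    cov = [[sum((row[j] - mu[j]) * (row[k] - mu[k]) for row in X) / n
            for k in range(d)]
           for j in range(d)]
    return mu, cov


def node_moments(w, b, mu, cov):
    """Full-dataset mean and variance of the linear node z = w.x + b,
    obtained from the feature moments without another pass over the data."""
    d = len(w)
    mean_z = sum(w[j] * mu[j] for j in range(d)) + b
    var_z = sum(w[j] * cov[j][k] * w[k] for j in range(d) for k in range(d))
    return mean_z, var_z


def normalize(z, mean_z, var_z, eps=1e-5):
    """Normalize a node activation with full-dataset statistics
    (eps is an assumed stabilizer, as in standard BN)."""
    return (z - mean_z) / (var_z + eps) ** 0.5


# Synthetic stand-in for a training set: 500 samples, 3 input features.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(2, 3), random.uniform(-1, 1)]
     for _ in range(500)]
w, b = [0.5, -1.0, 2.0], 0.3  # arbitrary weights of one linear node

mu, cov = feature_moments(X)  # depends on the data only, not on w or b
mean_z, var_z = node_moments(w, b, mu, cov)

# Brute-force check: evaluate the node on every sample and take its moments.
z = [sum(wj * xj for wj, xj in zip(w, row)) + b for row in X]
brute_mean = sum(z) / len(z)
brute_var = sum((zi - brute_mean) ** 2 for zi in z) / len(z)

z_hat = [normalize(zi, mean_z, var_z) for zi in z]
```

In this sketch the feature moments are computed once and can be reused whenever `w` and `b` change, so the node statistics stay exact for the whole dataset without depending on the batch size.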
Original language | English |
---|---|
Pages (from-to) | 2780-2784 |
Number of pages | 5 |
Journal | Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing |
Volume | 2021-June |
State | Published - 2021 |
Event | 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada; Duration: 6 Jun 2021 → 11 Jun 2021 |
Keywords
- Batch normalization
- CRF
- FBN
All Science Journal Classification (ASJC) codes
- Software
- Signal Processing
- Electrical and Electronic Engineering