Why do deep convolutional networks generalize so poorly to small image transformations?

Aharon Azulay, Yair Weiss

Research output: Contribution to journal › Article › peer-review

Abstract

Convolutional Neural Networks (CNNs) are commonly assumed to be invariant to small image transformations: either because of the convolutional architecture or because they were trained using data augmentation. Recently, several authors have shown that this is not the case: small translations or rescalings of the input image can drastically change the network’s prediction. In this paper, we quantify this phenomenon and ask why neither the convolutional architecture nor data augmentation is sufficient to achieve the desired invariance. Specifically, we show that the convolutional architecture does not give invariance because common architectures ignore the classical sampling theorem, and data augmentation does not give invariance because CNNs learn to be invariant to transformations only for images that are very similar to typical images from the training set. We discuss two possible solutions to this problem: (1) antialiasing the intermediate representations and (2) increasing data augmentation, and show that they provide at best a partial solution. Taken together, our results indicate that the problem of ensuring invariance to small image transformations in neural networks while preserving high accuracy remains unsolved.
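
The sampling-theorem argument can be seen in a minimal 1-D sketch (illustrative NumPy code, not code from the paper): subsampling a signal with frequency content above the Nyquist rate makes the output highly sensitive to a one-sample shift, and blurring before subsampling, as in solution (1), only partially restores stability.

    import numpy as np

    # A high-frequency 1-D signal and a copy shifted by one sample.
    x = np.array([0., 1., 0., 1., 0., 1., 0., 1.])
    x_shift = np.roll(x, 1)

    # Stride-2 subsampling (as in strided convolution or pooling) ignores
    # the sampling theorem: the two outputs are completely different.
    print(x[::2])        # [0. 0. 0. 0.]
    print(x_shift[::2])  # [1. 1. 1. 1.]

    # Antialiasing: blur with a binomial kernel before subsampling.
    # The subsampled outputs now differ in only one entry -- a partial fix.
    kernel = np.array([0.25, 0.5, 0.25])

    def blur(s):
        return np.convolve(s, kernel, mode='same')

    print(blur(x)[::2])        # [0.25 0.5  0.5  0.5 ]
    print(blur(x_shift)[::2])  # [0.5  0.5  0.5  0.5 ]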

Original language: English
Article number: 184
Number of pages: 25
Journal: Journal of Machine Learning Research
Volume: 20
State: Published - 1 Nov 2019

Keywords

  • Deep Convolutional Neural Networks
  • Generalization
  • Machine Learning

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence
