An Optimization and Generalization Analysis for Max-Pooling Networks

Alon Brutzkus, Amir Globerson

Research output: Contribution to journal › Conference article › peer-review

Abstract

Max-pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However, these architectures are not well understood from a theoretical perspective. For example, we do not understand when they can be globally optimized, or what effect over-parameterization has on generalization. Here we perform a theoretical analysis of a convolutional max-pooling architecture, proving that it can be globally optimized and can generalize well even for highly over-parameterized models. Our analysis focuses on a data-generating distribution inspired by pattern detection problems, where a “discriminative” pattern needs to be detected among “spurious” patterns. We empirically validate that CNNs significantly outperform fully connected networks in our setting, as predicted by our theoretical results.

Original language: English
Pages (from-to): 1650-1660
Number of pages: 11
Journal: Proceedings of Machine Learning Research
Volume: 161
State: Published - 2021
Event: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021 - Virtual, Online
Duration: 27 Jul 2021 → 30 Jul 2021

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
