On Margin Maximization in Linear and ReLU Networks

Gal Vardi, Ohad Shamir, Nathan Srebro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The implicit bias of neural networks has been extensively studied in recent years. Lyu and Li [2019] showed that in homogeneous networks trained with the exponential or the logistic loss, gradient flow converges to a KKT point of the max margin problem in parameter space. However, that leaves open the question of whether this point will generally be an actual optimum of the max margin problem. In this paper, we study this question in detail, for several neural network architectures involving linear and ReLU activations. Perhaps surprisingly, we show that in many cases, the KKT point is not even a local optimum of the max margin problem. On the flip side, we identify multiple settings where a local or global optimum can be guaranteed.
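To make the abstract's terminology concrete, the following is a sketch of the max margin problem in parameter space in the standard form studied by Lyu and Li [2019]; the network $\Phi(\theta;\cdot)$, the data $\{(x_i, y_i)\}_{i=1}^{n}$ with $y_i \in \{\pm 1\}$, and the multipliers $\lambda_i$ are generic notation introduced here for illustration, not quoted from the paper:

$$
\min_{\theta}\ \tfrac{1}{2}\|\theta\|^{2}
\quad \text{subject to} \quad
y_i\, \Phi(\theta; x_i) \ \ge\ 1 \quad \text{for } i = 1, \dots, n.
$$

A KKT point of this problem is a feasible $\theta$ for which there exist multipliers $\lambda_i \ge 0$ satisfying stationarity and complementary slackness,

$$
\theta \ =\ \sum_{i=1}^{n} \lambda_i\, y_i\, \nabla_{\theta} \Phi(\theta; x_i),
\qquad
\lambda_i \big( y_i\, \Phi(\theta; x_i) - 1 \big) \ =\ 0 \ \ \text{for all } i,
$$

with gradients replaced by Clarke subdifferentials when $\Phi$ is not differentiable (as with ReLU activations). Because the constraints are non-convex in $\theta$ for deep networks, satisfying these conditions does not by itself imply local or global optimality, which is the gap the abstract examines.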

Original language: English
Title of host publication: NIPS'22
Subtitle of host publication: Proceedings of the 36th International Conference on Neural Information Processing Systems
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Pages: 37024-37036
Number of pages: 13
ISBN (Electronic): 9781713871088
State: Published - 3 Apr 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: 28 Nov 2022 - 9 Dec 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 28/11/22 - 9/12/22

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
