Lower Bounds on the Generalization Error of Nonlinear Learning Models

Inbar Seroussi, Ofer Zeitouni

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper we study lower bounds for the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data. We derive explicit generalization lower bounds for general biased estimators in the case of two-layer networks. For a linear activation function, the bound is asymptotically tight. In the nonlinear case, we compare our bounds with an empirical study of the stochastic gradient descent algorithm. In addition, we derive bounds for unbiased estimators, which show that the latter have unacceptable performance for truly nonlinear networks. The analysis uses elements from the theory of large random matrices.
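As an illustrative, hypothetical sketch (not the paper's experiment), the Python snippet below trains a two-layer student network with stochastic gradient descent on data generated by a random linear teacher and reports the empirical generalization (test) error, the quantity that the paper's lower bounds constrain. The layer widths, sample sizes, step size, activation, and teacher model are assumptions chosen only for illustration.

```python
# Minimal sketch: two-layer network trained by SGD in a teacher-student
# setting, with an empirical estimate of the generalization error.
# All sizes and hyperparameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

d, m = 50, 100            # input dimension, hidden width (commensurate scales)
n_train, n_test = 200, 2000
sigma = 0.1               # label noise level

# Random linear teacher: y = x @ w* + noise
w_star = rng.normal(size=d) / np.sqrt(d)

def sample(n):
    X = rng.normal(size=(n, d))
    y = X @ w_star + sigma * rng.normal(size=n)
    return X, y

X_tr, y_tr = sample(n_train)
X_te, y_te = sample(n_test)

# Two-layer student: f(x) = a^T tanh(W x) / sqrt(m)
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.normal(size=m) / np.sqrt(m)

def predict(X, W, a):
    return np.tanh(X @ W.T) @ a / np.sqrt(m)

lr, epochs, batch = 0.05, 200, 20
for epoch in range(epochs):
    perm = rng.permutation(n_train)
    for start in range(0, n_train, batch):
        idx = perm[start:start + batch]
        Xb, yb = X_tr[idx], y_tr[idx]
        H = np.tanh(Xb @ W.T)                 # hidden activations, shape (B, m)
        resid = H @ a / np.sqrt(m) - yb       # prediction residuals, shape (B,)
        # Gradients of the (half) mean-squared error w.r.t. a and W
        grad_a = H.T @ resid / (len(idx) * np.sqrt(m))
        grad_W = ((resid[:, None] * (1 - H**2) * a[None, :] / np.sqrt(m)).T @ Xb) / len(idx)
        a -= lr * grad_a
        W -= lr * grad_W

train_err = np.mean((predict(X_tr, W, a) - y_tr) ** 2)
test_err = np.mean((predict(X_te, W, a) - y_te) ** 2)
print(f"empirical train error: {train_err:.4f}")
print(f"empirical generalization (test) error: {test_err:.4f}")
```

The test error computed on fresh samples from the same teacher serves as the empirical counterpart of the generalization error that the Cramér-Rao-type lower bounds in the paper address.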

Original language: English
Pages (from-to): 7956-7970
Number of pages: 15
Journal: IEEE Transactions on Information Theory
Volume: 68
Issue number: 12
Early online date: 11 Jul 2022
DOIs
State: Published - 1 Dec 2022

Keywords

  • Cramér-Rao bound
  • generalization error
  • learning
  • random matrices

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
