Abstract
We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.
| Original language | English |
|---|---|
| Number of pages | 67 |
| Journal | Advances in Neural Information Processing Systems |
| Volume | 37 |
| State | Published - 25 Sep 2024 |
| Event | 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada Duration: 9 Dec 2024 → 15 Dec 2024 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing
Fingerprint
Dive into the research topics of 'Provable Tempered Overfitting of Minimal Nets and Typical Nets'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver