Learning a Single Neuron with Gradient Methods

Gilad Yehudai, Ohad Shamir

Research output: Contribution to journal › Conference article › peer-review

Abstract

We consider the fundamental problem of learning a single neuron x ↦ σ(⟨w, x⟩) in a realizable setting, using standard gradient methods with random initialization, and under general families of input distributions and activations. On the one hand, we show that some assumptions on both the distribution and the activation function are necessary. On the other hand, we prove positive guarantees under mild assumptions, which go significantly beyond those studied in the literature so far. We also point out and study the challenges in further strengthening and generalizing our results.
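To make the setting concrete, here is a minimal sketch (not the paper's analysis or exact assumptions) of the learning problem the abstract describes: a single ReLU neuron x ↦ σ(⟨w*, x⟩) learned in the realizable setting by plain gradient descent on the squared loss from a small random initialization. The Gaussian input distribution, step size, sample size, and iteration count are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 5, 2000
w_star = rng.standard_normal(d)      # ground-truth weights (realizable setting)
X = rng.standard_normal((n, d))      # standard Gaussian inputs (an assumption)
y = np.maximum(X @ w_star, 0.0)      # labels y = relu(<w*, x>), no noise

w = 0.01 * rng.standard_normal(d)    # small random initialization
lr = 0.1
for _ in range(500):
    pred = np.maximum(X @ w, 0.0)
    # Gradient of the empirical squared loss; relu'(z) = 1[z > 0].
    grad = ((pred - y) * (X @ w > 0))[:, None] * X
    w -= lr * grad.mean(axis=0)

print(np.linalg.norm(w - w_star))    # distance to the target neuron
```

In this toy run the iterates recover w* to high accuracy; the paper's contribution is characterizing when guarantees of this kind hold (and when they provably fail) across general input distributions and activation functions.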

Original language: English
Pages (from-to): 3756-3786
Number of pages: 31
Journal: Proceedings of Machine Learning Research
Volume: 125
State: Published - 2020
Event: 33rd Conference on Learning Theory, COLT 2020 - Virtual, Online, Austria
Duration: 9 Jul 2020 - 12 Jul 2020

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
