Symmetry & critical points for a model shallow neural network

Yossi Arjevani, Michael Field

Research output: Contribution to journal › Article › peer-review

Abstract

Using methods based on the analysis of real analytic functions, symmetry and equivariant bifurcation theory, we obtain sharp results on families of critical points of spurious minima that occur in optimization problems associated with fitting two-layer ReLU networks with k hidden neurons. The main mathematical result is a power series representation of families of critical points of spurious minima in terms of 1/k (coefficients independent of k). We also give a path-based formulation that naturally connects the critical points with critical points of an associated linear, but highly singular, optimization problem. These critical points closely approximate the critical points in the original problem. The mathematical theory is used to derive results on the original problem in neural nets, for example, precise estimates for several quantities that show that not all spurious minima are alike. In particular, we show that while the loss function at certain types of spurious minima decays to zero like k^{-1}, in other cases the loss converges to a strictly positive constant.

Original language: American English
Article number: 133014
Journal: Physica D: Nonlinear Phenomena
Volume: 427
Publication status: Published - Dec 2021
Externally published: Yes

ASJC Scopus subject areas

  • Statistical and Nonlinear Physics
  • Mathematical Physics
  • Condensed Matter Physics
  • Applied Mathematics
