Abstract
We study the implicit bias of generic optimization methods, including mirror descent, natural gradient descent, and steepest descent with respect to different potentials and norms, when optimizing underdetermined linear regression or separable linear classification problems. We explore the question of whether the specific global minimum (among the many possible global minima) reached by optimization can be characterized in terms of the potential or norm of the optimization geometry, independently of hyperparameter choices such as step size and momentum.
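For intuition, here is a minimal sketch (assuming NumPy; not taken from the paper) of the simplest instance of this phenomenon: gradient descent on an underdetermined least-squares problem, initialized at zero, converges to the minimum Euclidean-norm interpolating solution, so the reached global minimum is determined by the ℓ2 geometry of the updates rather than by the step size.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                      # underdetermined: more parameters than samples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Gradient descent on the squared loss 0.5 * ||Xw - y||^2, initialized at zero.
w = np.zeros(d)
lr = 1.0 / np.linalg.norm(X, ord=2) ** 2   # step size below 2 / largest eigenvalue of X^T X
for _ in range(20_000):
    w -= lr * X.T @ (X @ w - y)

# Minimum l2-norm interpolating solution, computed via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

print("interpolation error:", np.linalg.norm(X @ w - y))
print("distance to min-norm solution:", np.linalg.norm(w - w_min_norm))
```

Changing the geometry of the update (e.g., mirror descent with a different potential, or steepest descent with respect to a different norm) changes which interpolating solution is reached, which is the kind of characterization the paper investigates.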
| Original language | English |
| --- | --- |
| Pages (from-to) | 2932-2955 |
| Number of pages | 24 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 80 |
| State | Published - 2018 |
| Event | 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden |
| Duration | 10 Jul 2018 → 15 Jul 2018 |
All Science Journal Classification (ASJC) codes
- Computational Theory and Mathematics
- Human-Computer Interaction
- Software