TY - GEN
T1 - Oracle complexity of second-order methods for finite-sum problems
AU - Arjevani, Yossi
AU - Shamir, Ohad
N1 - Publisher Copyright: © 2017 International Machine Learning Society (IMLS). All rights reserved.
PY - 2017
Y1 - 2017
AB - Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods, which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require far fewer iterations than first-order methods, and hold the promise of more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of the individual functions in finite-sum problems can often be computed efficiently, e.g., because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer, perhaps surprisingly, is negative, at least in terms of worst-case guarantees. We also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.
UR - http://www.scopus.com/inward/record.url?scp=85048671695&partnerID=8YFLogxK
M3 - Conference contribution
T3 - 34th International Conference on Machine Learning, ICML 2017
SP - 274
EP - 297
BT - 34th International Conference on Machine Learning, ICML 2017
T2 - 34th International Conference on Machine Learning, ICML 2017
Y2 - 6 August 2017 through 11 August 2017
ER -