Abstract
A well-recognized limitation of kernel learning is the requirement to handle a kernel matrix whose size is quadratic in the number of training examples. Many methods have been proposed to reduce this computational cost, most of them relying on a subset of the kernel matrix entries, some form of low-rank matrix approximation, or a random projection method. In this paper, we study lower bounds on the error attainable by such methods as a function of the number of entries observed in the kernel matrix or the rank of an approximate kernel matrix. We show that there are kernel learning problems where no such method will lead to non-trivial computational savings. Our results also quantify how the problem difficulty depends on parameters such as the nature of the loss function, the regularization parameter, the norm of the desired predictor, and the kernel matrix rank, and they suggest cases where more efficient kernel learning might be possible.
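To make the setting concrete, the sketch below (not from the paper; a generic illustration) shows a Nyström-style low-rank approximation of an RBF kernel matrix built from a random subset of its columns, one instance of the entry-subsampling and low-rank strategies the abstract refers to. The helper names `rbf_kernel` and `nystrom_approx` are hypothetical; only the `n × m` sampled block of the kernel matrix is ever evaluated, which is exactly the kind of entry budget the paper's lower bounds are stated in.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel from pairwise squared distances.
    sq = (np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :]
          - 2 * X @ Y.T)
    return np.exp(-gamma * sq)

def nystrom_approx(X, m, gamma=1.0, rng=None):
    """Rank-at-most-m Nystrom approximation K ~= C W^+ C^T,
    built from m uniformly sampled columns of the n x n kernel matrix.
    Only n*m kernel entries are computed, not the full n^2."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)   # n x m: the sampled columns
    W = C[idx]                         # m x m: intersection block K[idx][:, idx]
    return C, np.linalg.pinv(W)        # K_hat = C @ W_pinv @ C.T

# Usage: the relative approximation error typically shrinks as m grows,
# but the paper's lower bounds show the resulting *learning* error
# cannot always be made small with few observed entries.
X = np.random.default_rng(0).standard_normal((500, 10))
K = rbf_kernel(X, X)  # full matrix, computed here only to measure error
for m in (25, 100, 400):
    C, W_pinv = nystrom_approx(X, m, rng=1)
    K_hat = C @ W_pinv @ C.T
    print(m, np.linalg.norm(K - K_hat) / np.linalg.norm(K))
```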
| Original language | English |
| --- | --- |
| Pages (from-to) | 297-325 |
| Number of pages | 29 |
| Journal | Journal of Machine Learning Research |
| Volume | 40 |
| Issue number | 2015 |
| DOIs | |
| State | Published - 2015 |
| Event | 28th Conference on Learning Theory, COLT 2015, Paris, France (2 Jul 2015 → 6 Jul 2015) |
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Software
- Statistics and Probability
- Artificial Intelligence