Kernel methods match deep neural networks on TIMIT

Po Sen Huang, Haim Avron, Tara N. Sainath, Vikas Sindhwani, Bhuvana Ramabhadran

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Despite their theoretical appeal and grounding in tractable convex optimization techniques, kernel methods are often not the first choice for large-scale speech applications due to their significant memory requirements and computational expense. In recent years, randomized approximate feature maps have emerged as an elegant mechanism to scale-up kernel methods. Still, in practice, a large number of random features is required to obtain acceptable accuracy in predictive tasks. In this paper, we develop two algorithmic schemes to address this computational bottleneck in the context of kernel ridge regression. The first scheme is a specialized distributed block coordinate descent procedure that avoids the explicit materialization of the feature space data matrix, while the second scheme gains efficiency by combining multiple weak random feature models in an ensemble learning framework. We demonstrate that these schemes enable kernel methods to match the performance of state of the art Deep Neural Networks on TIMIT for speech recognition and classification tasks. In particular, we obtain the best classification error rates reported on TIMIT using kernel methods.

Original languageEnglish
Title of host publication2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages205-209
Number of pages5
ISBN (Print)9781479928927
DOIs
StatePublished - 2014
Externally publishedYes
Event2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: 4 May 20149 May 2014

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Conference

Conference2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/TerritoryItaly
CityFlorence
Period4/05/149/05/14

Keywords

  • deep learning
  • distributed computing
  • large-scale kernel machines
  • random features
  • speech recognition

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Kernel methods match deep neural networks on TIMIT'. Together they form a unique fingerprint.

Cite this