Linear classifiers are nearly optimal when hidden variables have diverse effects

Nader H. Bshouty, Philip M. Long

Research output: Contribution to journal › Article › peer-review

Abstract

We analyze classification problems in which the data are generated by a two-tiered random process: the class is generated first, then a layer of conditionally independent hidden variables, and finally the observed variables. For sources of this kind, the Bayes-optimal rule for predicting the class from the values of the observed variables is a two-layer neural network. We show that, if the hidden variables have non-negligible effects on many observed variables, a linear classifier approximates the error rate of the Bayes-optimal classifier up to lower-order terms. We also show that the hinge loss of a linear classifier is not much greater than the Bayes error rate, which implies that an accurate linear classifier can be found efficiently.
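To make the setup concrete, here is a small illustrative sketch (not taken from the paper): it simulates one possible two-tiered source of the kind described above, in which each hidden variable influences its own block of observed variables, and compares a hinge-loss-trained linear classifier against the exact Bayes-optimal rule for that particular source. All distributions, parameter values, and the use of NumPy and scikit-learn are assumptions made for illustration only.

```python
# Illustrative sketch of a two-tiered source: class -> conditionally independent
# hidden variables -> observed variables.  Parameter values are assumptions.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_hidden, n_obs_per_hidden, n_samples = 5, 40, 4000
P_H_POS, P_H_NEG = 0.7, 0.3   # P(h_j = 1 | y = +1), P(h_j = 1 | y = -1)
P_X_ON, P_X_OFF = 0.6, 0.4    # P(x_i = 1 | its hidden variable = 1 or 0)

def sample(n):
    y = rng.choice([-1, 1], size=n)                        # class label
    # hidden variables: conditionally independent given the class
    p_hidden = np.where(y[:, None] == 1, P_H_POS, P_H_NEG)
    h = (rng.random((n, n_hidden)) < p_hidden).astype(int)
    # each hidden variable affects a whole block of observed variables
    p_obs = np.where(np.repeat(h, n_obs_per_hidden, axis=1) == 1, P_X_ON, P_X_OFF)
    x = (rng.random((n, n_hidden * n_obs_per_hidden)) < p_obs).astype(float)
    return x, y

def bayes_predict(X):
    # Bayes-optimal rule for this source: marginalize the hidden variable within
    # each block, then sum the per-block log-likelihood ratios across blocks
    # (the two-layer structure mentioned in the abstract).
    log_ratio = np.zeros(X.shape[0])
    for j in range(n_hidden):
        blk = X[:, j * n_obs_per_hidden:(j + 1) * n_obs_per_hidden]
        ones = blk.sum(axis=1)
        zeros = n_obs_per_hidden - ones
        ll_h1 = ones * np.log(P_X_ON) + zeros * np.log(1 - P_X_ON)
        ll_h0 = ones * np.log(P_X_OFF) + zeros * np.log(1 - P_X_OFF)
        lp_pos = np.logaddexp(np.log(P_H_POS) + ll_h1, np.log(1 - P_H_POS) + ll_h0)
        lp_neg = np.logaddexp(np.log(P_H_NEG) + ll_h1, np.log(1 - P_H_NEG) + ll_h0)
        log_ratio += lp_pos - lp_neg
    return np.where(log_ratio >= 0, 1, -1)

X_train, y_train = sample(n_samples)
X_test, y_test = sample(n_samples)

# Linear classifier trained by minimizing a regularized hinge loss
clf = LinearSVC(C=1.0, max_iter=5000).fit(X_train, y_train)
linear_err = np.mean(clf.predict(X_test) != y_test)
bayes_err = np.mean(bayes_predict(X_test) != y_test)
print(f"linear classifier test error: {linear_err:.3f}")
print(f"Bayes-optimal test error:     {bayes_err:.3f}")
```

Under these assumed distributions, every hidden variable influences many observed variables, so the paper's qualitative claim suggests the linear classifier's error should be close to the Bayes error; the sketch is only meant to illustrate the generative setup, not to reproduce the paper's analysis.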

Original language: English
Pages (from-to): 209-231
Number of pages: 23
Journal: Machine Learning
Volume: 86
Issue number: 2
DOIs
State: Published - Feb 2012

Keywords

  • Bayes-optimal
  • Hidden variables
  • Learning theory
  • Linear classification

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
