Online PCA for contaminated data

Jiashi Feng, Huan Xu, Shie Mannor, Shuicheng Yan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

We consider online Principal Component Analysis (PCA) where contaminated samples (containing outliers) are revealed sequentially to the Principal Components (PCs) estimator. Due to their sensitivity to outliers, previous online PCA algorithms fail in this case and their results can be arbitrarily skewed by the outliers. Here we propose the online robust PCA (online RPCA) algorithm, which is able to steadily improve the PCs estimation upon an initial one, even when faced with a constant fraction of outliers. We show that the final result of the proposed online RPCA has an acceptable degradation from the optimum. In fact, under mild conditions, online RPCA achieves the maximal robustness with a 50% breakdown point. Moreover, online RPCA is shown to be efficient in both storage and computation, since it need not re-explore the previous samples as traditional robust PCA algorithms do. This endows online RPCA with scalability for large-scale data.

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 26
Subtitle of host publication: NIPS 2013
State: Published - 2013
Event: 27th Annual Conference on Neural Information Processing Systems, NIPS 2013 - Lake Tahoe, NV, United States
Duration: 5 Dec 2013 to 10 Dec 2013

Publication series

Name: Advances in Neural Information Processing Systems

Conference

Conference: 27th Annual Conference on Neural Information Processing Systems, NIPS 2013
Country/Territory: United States
City: Lake Tahoe, NV
Period: 5/12/13 to 10/12/13

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Computer Networks and Communications
  • Information Systems
