Communication-efficient algorithms for distributed stochastic principal component analysis

Dan Garber, Ohad Shamir, Nathan Srebro

פרסום מחקרי: פרק בספר / בדוח / בכנספרסום בספר כנסביקורת עמיתים

תקציר

We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of m stores a sample of n points sampled i.i.d. From a single unknown distribution. We study algorithms for estimating the leading principal component of the population covariance matrix that are both communication-efficient and achieve estimation error of the order of the centralized ERM solution that uses all mn samples. On the negative side, wc show that in contrast to results obtained for distributed estimation under convexity assumptions, for the PCA objective, simply averaging the local ERM solutions cannot guar-antee error that is consistent with the centralized ERM. We show that this unfortunate phenomena can be remedied by performing a simple correction step which correlates between the individual solutions, and provides an estimator that is consistent with the centralized ERM for sufficiently-large n. We also introduce an iterative distributed algorithm that is applicable in any regime of n, which is based on distributed matrix-vector products. The algorithm gives significant acceleration in terms of communication rounds over previous distributed algorithms, in a wide regime of parameters.

שפה מקוריתאנגלית
כותר פרסום המארח34th International Conference on Machine Learning, ICML 2017
עמודים1943-1964
מספר עמודים22
מסת"ב (אלקטרוני)9781510855144
סטטוס פרסוםפורסם - 2017
אירוע34th International Conference on Machine Learning, ICML 2017 - Sydney, אוסטרליה
משך הזמן: 6 אוג׳ 201711 אוג׳ 2017

סדרות פרסומים

שם34th International Conference on Machine Learning, ICML 2017
כרך3

כנס

כנס34th International Conference on Machine Learning, ICML 2017
מדינה/אזוראוסטרליה
עירSydney
תקופה6/08/1711/08/17

ASJC Scopus subject areas

  • ???subjectarea.asjc.1700.1703???
  • ???subjectarea.asjc.1700.1709???
  • ???subjectarea.asjc.1700.1712???

טביעת אצבע

להלן מוצגים תחומי המחקר של הפרסום 'Communication-efficient algorithms for distributed stochastic principal component analysis'. יחד הם יוצרים טביעת אצבע ייחודית.

פורמט ציטוט ביבליוגרפי