Communication-efficient algorithms for distributed stochastic principal component analysis

Dan Garber, Ohad Shamir, Nathan Srebro

نتاج البحث: فصل من :كتاب / تقرير / مؤتمرمنشور من مؤتمرمراجعة النظراء

ملخص

We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of m stores a sample of n points sampled i.i.d. From a single unknown distribution. We study algorithms for estimating the leading principal component of the population covariance matrix that are both communication-efficient and achieve estimation error of the order of the centralized ERM solution that uses all mn samples. On the negative side, wc show that in contrast to results obtained for distributed estimation under convexity assumptions, for the PCA objective, simply averaging the local ERM solutions cannot guar-antee error that is consistent with the centralized ERM. We show that this unfortunate phenomena can be remedied by performing a simple correction step which correlates between the individual solutions, and provides an estimator that is consistent with the centralized ERM for sufficiently-large n. We also introduce an iterative distributed algorithm that is applicable in any regime of n, which is based on distributed matrix-vector products. The algorithm gives significant acceleration in terms of communication rounds over previous distributed algorithms, in a wide regime of parameters.

اللغة الأصليةالإنجليزيّة
عنوان منشور المضيف34th International Conference on Machine Learning, ICML 2017
الصفحات1943-1964
عدد الصفحات22
رقم المعيار الدولي للكتب (الإلكتروني)9781510855144
حالة النشرنُشِر - 2017
الحدث34th International Conference on Machine Learning, ICML 2017 - Sydney, أستراليا
المدة: ٦ أغسطس ٢٠١٧١١ أغسطس ٢٠١٧

سلسلة المنشورات

الاسم34th International Conference on Machine Learning, ICML 2017
مستوى الصوت3

!!Conference

!!Conference34th International Conference on Machine Learning, ICML 2017
الدولة/الإقليمأستراليا
المدينةSydney
المدة٦/٠٨/١٧١١/٠٨/١٧

All Science Journal Classification (ASJC) codes

  • !!Computational Theory and Mathematics
  • !!Human-Computer Interaction
  • !!Software

بصمة

أدرس بدقة موضوعات البحث “Communication-efficient algorithms for distributed stochastic principal component analysis'. فهما يشكلان معًا بصمة فريدة.

قم بذكر هذا