TY - GEN
T1 - Nonparametric Canonical Correlation Analysis
AU - Michaeli, Tomer
AU - Wang, Weiran
AU - Livescu, Karen
N1 - Publisher Copyright: © 2016 by the author(s).
PY - 2016
Y1 - 2016
N2 - Canonical correlation analysis (CCA) is a classical representation learning technique for finding correlated variables in multi-view data. Several nonlinear extensions of the original linear CCA have been proposed, including kernel and deep neural network methods. These approaches seek maximally correlated projections among families of functions, which the user specifies (by choosing a kernel or neural network structure), and are computationally demanding. Interestingly, the theory of nonlinear CCA, without functional restrictions, had been studied in the population setting by Lancaster already in the 1950s, but these results have not inspired practical algorithms. We revisit Lancaster's theory to devise a practical algorithm for nonparametric CCA (NCCA). Specifically, we show that the solution can be expressed in terms of the singular value decomposition of a certain operator associated with the joint density of the views. Thus, by estimating the population density from data, NCCA reduces to solving an eigenvalue system, superficially like kernel CCA but, importantly, without requiring the inversion of any kernel matrix. We also derive a partially linear CCA (PLCCA) variant in which one of the views undergoes a linear projection while the other is nonparametric. Using a kernel density estimate based on a small number of nearest neighbors, our NCCA and PLCCA algorithms are memory-efficient, often run much faster than kernel CCA, and perform better than kernel CCA and comparably to deep CCA.
UR - http://www.scopus.com/inward/record.url?scp=84998817117&partnerID=8YFLogxK
M3 - Conference contribution
T3 - 33rd International Conference on Machine Learning, ICML 2016
SP - 2906
EP - 2916
BT - 33rd International Conference on Machine Learning, ICML 2016
A2 - Weinberger, Kilian Q.
A2 - Balcan, Maria Florina
T2 - 33rd International Conference on Machine Learning, ICML 2016
Y2 - 19 June 2016 through 24 June 2016
ER -