TY - JOUR
T1 - Face dissimilarity judgments are predicted by representational distance in morphable and image-computable models
AU - Jozwik, Kamila M.
AU - O’Keeffe, Jonathan
AU - Storrs, Katherine R.
AU - Guo, Wenxuan
AU - Golan, Tal
AU - Kriegeskorte, Nikolaus
N1 - Funding Information: This research was supported by the Wellcome Trust Grant 206521/Z/17/Z (to K.M.J.), the Alexander von Humboldt Foundation postdoctoral fellowship (to K.M.J.), the Alexander von Humboldt Foundation postdoctoral fellowship (to K.R.S.), the Hessian Ministry of Science and Arts cluster project “The Adaptive Mind,” the Wellcome Trust, and the MRC Cognition and Brain Sciences Unit. This publication was made possible in part with the support of the Charles H. Revson Foundation (T.G.). The statements made and views expressed, however, are solely the responsibility of the authors. Publisher Copyright: Copyright © 2022 the Author(s).
PY - 2022/7/5
Y1 - 2022/7/5
AB - Human vision is attuned to the subtle differences between individual faces. Yet we lack a quantitative way of predicting how similar two face images look and whether they appear to show the same person. Principal component–based three-dimensional (3D) morphable models are widely used to generate stimuli in face perception research. These models capture the distribution of real human faces in terms of dimensions of physical shape and texture. How well does a “face space” based on these dimensions capture the similarity relationships humans perceive among faces? To answer this, we designed a behavioral task to collect dissimilarity and same/different identity judgments for 232 pairs of realistic faces. Stimuli sampled geometric relationships in a face space derived from principal components of 3D shape and texture (Basel face model [BFM]). We then compared a wide range of models in their ability to predict the data, including the BFM from which faces were generated, an active appearance model derived from face photographs, and image-computable models of visual perception. Euclidean distance in the BFM explained both dissimilarity and identity judgments surprisingly well. In a comparison against 16 diverse models, BFM distance was competitive with representational distances in state-of-the-art deep neural networks (DNNs), including novel DNNs trained on BFM synthetic identities or BFM latents. Models capturing the distribution of face shape and texture across individuals are not only useful tools for stimulus generation. They also capture important information about how faces are perceived, suggesting that human face representations are tuned to the statistical distribution of faces.
KW - Basel face model
KW - deep neural networks
KW - face identification
KW - face perception
KW - face similarity
UR - http://www.scopus.com/inward/record.url?scp=85133147231&partnerID=8YFLogxK
DO - 10.1073/pnas.2115047119
M3 - Article
C2 - 35767642
SN - 0027-8424
VL - 119
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 27
M1 - e2115047119
ER -