Multidimensional Scaling (MDS) is a classical technique for embedding data in low dimensions, still in widespread use. In this paper we study MDS in a modern setting: high dimensions and ambient measurement noise. We show that as the ambient noise level increases, MDS suffers a sharp breakdown that depends on the data dimension and noise level, and we derive an explicit formula for this breakdown point in the case of white noise. We then introduce MDS+, a simple variant of MDS, which applies a shrinkage nonlinearity to the eigenvalues of the MDS similarity matrix. Under a natural loss function measuring the embedding quality, we prove that the shrinkage function used by MDS+ is the unique asymptotically optimal one. Compared with MDS, MDS+ yields improved embeddings, sometimes significantly so. Importantly, MDS+ also calculates the optimal dimension into which the data should be embedded.
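To make the pipeline concrete, the following is a minimal sketch of classical MDS with a hook for shrinking the eigenvalues of the similarity matrix. It is not the MDS+ procedure itself: the optimal shrinkage function and the breakdown formula are not stated in this abstract, so the `shrink` argument is a placeholder and the hard threshold in the usage note is only an illustrative assumption. The names `classical_mds` and `shrink` are hypothetical.

```python
import numpy as np


def classical_mds(D, dim, shrink=None):
    """Embed n points in `dim` dimensions from an n x n Euclidean distance matrix D.

    `shrink` is an optional function applied to the eigenvalues of the
    similarity matrix (a placeholder, not the MDS+ optimal shrinker).
    """
    n = D.shape[0]
    # Centering matrix H = I - (1/n) 11^T.
    H = np.eye(n) - np.ones((n, n)) / n
    # MDS similarity matrix: double-centered squared distances.
    S = -0.5 * H @ (D ** 2) @ H
    # eigh returns eigenvalues in ascending order; sort them descending.
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    if shrink is not None:
        evals = shrink(evals)
    # Keep the leading `dim` eigenpairs; clip negatives before the square root.
    top = np.clip(evals[:dim], 0.0, None)
    return evecs[:, :dim] * np.sqrt(top)


# Noise-free example: 20 points in the plane are recovered up to rigid motion.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Y = classical_mds(D, dim=2)
```

Passing, for example, `shrink=lambda ev: np.where(ev > t, ev, 0.0)` for some assumed threshold `t` zeroes out small eigenvalues, which shows where an eigenvalue nonlinearity enters the computation. The number of eigenvalues that survive shrinkage then acts as the embedding dimension.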