Abstract
In some multivariate problems with missing data, pairs of variables exist that are never observed together. For example, some modern biological tools can produce data of this form. As a result of this structure, the covariance matrix is only partially identifiable, and point estimation requires that identifying assumptions be made. These assumptions can introduce an unknown and potentially large bias into the inference. This paper presents a method based on semidefinite programming for automatically quantifying this potential bias by computing the range of possible equal-likelihood inferred values for convex functions of the covariance matrix. We focus on the bias of missing value imputation via conditional expectation and show that our method can give an accurate assessment of the true error in cases where estimates based on sampling uncertainty alone are overly optimistic.
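As a minimal illustration of the partial identifiability described above, consider three standardized Gaussian variables in which X1 and X3 are never observed together. The sketch below is a closed-form three-variable special case, not the semidefinite program developed in the paper. It assumes the correlations `r12` and `r23` are known, and all function names are hypothetical. Positive semidefiniteness of the correlation matrix confines the unobserved correlation `r13` to an interval. Under this missingness pattern the observed data carry no information about `r13`, so every value in that interval fits equally well. The imputed value of X1 given X2 and X3 is linear in `r13`, so its range is attained at the interval's endpoints.

```python
import math


def unobserved_correlation_range(r12, r23):
    """Return (low, high) bounds on r13 that keep the 3x3 correlation matrix PSD.

    The bounds follow from requiring a non-negative determinant:
    r13 lies in r12*r23 +/- sqrt((1 - r12^2) * (1 - r23^2)).
    """
    center = r12 * r23
    half_width = math.sqrt((1.0 - r12 ** 2) * (1.0 - r23 ** 2))
    return center - half_width, center + half_width


def impute_x1(x2, x3, r12, r23, r13):
    """Conditional expectation E[X1 | X2=x2, X3=x3] for standardized Gaussians."""
    denom = 1.0 - r23 ** 2  # assumes |r23| < 1
    beta2 = (r12 - r13 * r23) / denom
    beta3 = (r13 - r12 * r23) / denom
    return beta2 * x2 + beta3 * x3


def imputation_range(x2, x3, r12, r23):
    """Range of imputed X1 values over all admissible r13."""
    low, high = unobserved_correlation_range(r12, r23)
    # The imputed value is linear in r13, so its extremes occur at the endpoints.
    a = impute_x1(x2, x3, r12, r23, low)
    b = impute_x1(x2, x3, r12, r23, high)
    return min(a, b), max(a, b)


if __name__ == "__main__":
    r12, r23 = 0.6, 0.8  # illustrative values, not taken from the paper
    print("admissible r13:", unobserved_correlation_range(r12, r23))
    print("imputed X1 range at x2 = x3 = 1:", imputation_range(1.0, 1.0, r12, r23))
```

Setting `r13 = r12 * r23` corresponds to assuming X1 and X3 are conditionally independent given X2, one possible identifying assumption. The width of the imputation range then indicates how much bias that choice could hide.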
Original language | English |
---|---|
Pages (from-to) | 529-546 |
Number of pages | 18 |
Journal | Computational Statistics |
Volume | 29 |
Issue number | 3-4 |
State | Published - Jun 2014 |
Keywords
- Convex optimization
- EM algorithm
- Flow cytometry
- Mass cytometry
- Robust inference
- Semidefinite programming
- cyTOF
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Computational Mathematics