Abstract
Scientists use high-dimensional measurement assays to detect and prioritize regions of strong signal in spatially organized domain. Examples include finding methylation-enriched genomic regions using microarrays, and active cortical areas using brain-imaging. The most common procedure for detecting potential regions is to group neighboring sites where the signal passed a threshold. However, one needs to account for the selection bias induced by this procedure to avoid diminishing effects when generalizing to a population. This article introduces pin-down inference, a model and an inference framework that permit population inference for these detected regions. Pin-down inference provides nonasymptotic point and confidence interval estimators for the mean effect in the region that account for local selection bias. Our estimators accommodate nonstationary covariances that are typical of these data, allowing researchers to better compare regions of different sizes and correlation structures. Inference is provided within a conditional one-parameter exponential family per region, with truncations that match the selection constraints. A secondary screening-and-adjustment step allows pruning the set of detected regions, while controlling the false-coverage rate over the reported regions. We apply the method to genomic regions with differing DNA-methylation rates across tissue. Our method provides superior power compared to other conditional and nonparametric approaches. Supplementary materials for this article are available online.
Original language | English |
---|---|
Pages (from-to) | 1351-1365 |
Number of pages | 15 |
Journal | Journal of the American Statistical Association |
Volume | 114 |
Issue number | 527 |
DOIs | |
State | Published - 3 Jul 2019 |
Keywords
- Bump-hunting
- Conditional inference
- DNA-methylation
- Nonstationary process
- Selective inference
- Spatial statistics
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty