TESTING FOR DIFFERENTIAL ABUNDANCE IN COMPOSITIONAL COUNTS DATA, WITH APPLICATION TO MICROBIOME STUDIES

Barak Brill, Amnon Amir, Ruth Heller

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying which taxa in our microbiota are associated with traits of interest is important for advancing science and health. However, the identification is challenging because the measured vector of taxa counts (by amplicon sequencing) is compositional, so a change in the abundance of one taxon in the microbiota induces a change in the number of sequenced counts across all taxa. The data are typically sparse, with many zero counts present either due to biological variance or limited sequencing depth. We examine the case of Crohn’s disease, where the microbial load changes substantially with the disease. For this representative example of a highly compositional setting, we show that existing methods designed to identify differentially abundant taxa may have an inflated number of false positives. We introduce a novel nonparametric approach that provides valid inference, even when the fraction of zero counts is substantial. Our approach uses a set of reference taxa that are non-differentially abundant, which can be estimated from the data or from outside information. Our approach also allows for a novel type of testing: multivariate tests of differential abundance over a focused subset of the taxa. Genera-level multivariate testing discovers additional genera as differentially abundant by avoiding agglomeration of taxa.
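The compositional effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' procedure and uses made-up abundances and hypothetical taxon names (A, B, C); it only shows that when one taxon changes in absolute abundance, the observed proportions of unchanged taxa shift too, while the ratio of an unchanged taxon to an unchanged reference taxon stays the same.

```python
# Toy illustration of compositional bias. All numbers are invented for
# illustration and are not data from the paper.

def proportions(abundances):
    """Convert absolute abundances to relative proportions."""
    total = sum(abundances.values())
    return {taxon: value / total for taxon, value in abundances.items()}

# Hypothetical absolute abundances: only taxon A differs between groups.
healthy = {"A": 100, "B": 100, "C": 200}
disease = {"A": 500, "B": 100, "C": 200}

p_healthy = proportions(healthy)
p_disease = proportions(disease)

# Taxon B is unchanged in absolute terms, yet its proportion drops,
# which a naive comparison of proportions would flag as a difference.
print("B proportion:", p_healthy["B"], "->", p_disease["B"])

# Assumption for this sketch: taxon C is a known non-differentially
# abundant reference. The ratio of B to C is then unaffected by A.
ratio_healthy = p_healthy["B"] / p_healthy["C"]
ratio_disease = p_disease["B"] / p_disease["C"]
print("B/C ratio:", ratio_healthy, "->", ratio_disease)
```

In this toy setting, comparing taxa relative to a reference removes the spurious change in B; how a reference set is chosen and tested in practice is the subject of the article itself.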

Original language: English
Pages (from-to): 2648-2671
Number of pages: 24
Journal: Annals of Applied Statistics
Volume: 16
Issue number: 4
State: Published - Dec 2022

Keywords

  • Compositional bias
  • analysis of composition
  • nonparametric tests
  • normalization
  • rarefaction

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modelling and Simulation
  • Statistics, Probability and Uncertainty
