TY - JOUR
T1 - Weighted pooling - practical and cost-effective techniques for pooled high-throughput sequencing
AU - Golan, David
AU - Erlich, Yaniv
AU - Rosset, Saharon
N1 - Funding Information: S.R. and D.G. were supported in part by an Open Collaborative Research grant from IBM and by Israeli Science Foundation grant 1227/09. D.G. was also supported in part by a fellowship from the Edmond J. Safra Center for Bioinformatics at Tel-Aviv University. Y.E. was supported by an NIH grant (R21HG006167). Y.E. is an Andria and Paul Heafy Family Fellow.
PY - 2012/6
Y1 - 2012/6
N2 - Motivation: Despite the rapid decline in sequencing costs, sequencing large cohorts of individuals is still prohibitively expensive. Recently, several sophisticated pooling designs were suggested that can identify carriers of rare alleles in large cohorts with a significantly smaller number of pools, thus dramatically reducing the cost of such large-scale sequencing projects. These approaches use combinatorial pooling designs where each individual is either present or absent from a pool. One can then infer the number of carriers in a pool, and by combining information across pools, reconstruct the identity of the carriers. Results: We show that one can gain further efficiency and cost reduction by using 'weighted' designs, in which different individuals donate different amounts of DNA to the pools. Intuitively, in this situation, the number of mutant reads in a pool indicates not only the number of carriers but also their identity. We describe and study a powerful example of such weighted designs, using non-overlapping pools. We demonstrate that this approach is not only easier to implement and analyze but is also competitive in terms of accuracy with combinatorial designs when identifying rare variants, and is superior when sequencing common variants. We then discuss how weighting can be incorporated into existing combinatorial designs to increase their accuracy and demonstrate the resulting improvement using simulations. Finally, we argue that weighted designs have enough power to facilitate detection of common alleles, so they can be used as a cornerstone of whole-exome sequencing projects.
UR - http://www.scopus.com/inward/record.url?scp=84863529201&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bts208
DO - 10.1093/bioinformatics/bts208
M3 - Article
SN - 1367-4803
VL - 28
SP - i197-i206
JO - Bioinformatics
JF - Bioinformatics
IS - 12
M1 - bts208
ER -