Abstract
Single-cell RNA sequencing (scRNA-seq) datasets contain true single cells, or singlets, in addition to cells that coalesce during the protocol, or doublets. Identifying singlets with high fidelity in scRNA-seq is necessary to avoid false negative and false positive discoveries. Although several methodologies have been proposed, they are typically tested on highly heterogeneous datasets and lack a priori knowledge of true singlets. Here, we leveraged datasets with synthetically introduced DNA barcodes for a hitherto unexplored application: to extract ground-truth singlets. We demonstrated the feasibility of our framework, “singletCode,” to evaluate existing doublet detection methods across a range of contexts. We also leveraged our ground-truth singlets to train a proof-of-concept machine learning classifier, which outperformed other doublet detection algorithms. Our integrative framework can identify ground-truth singlets and enable robust doublet detection in non-barcoded datasets.
Original language | English |
---|---|
Article number | 100592 |
Journal | Cell Genomics |
Volume | 4 |
Issue number | 7 |
State | Published - 10 Jul 2024 |
Keywords
- barcoding
- benchmarking
- doublet detection
- lineage tracing
- machine learning
- scRNA-seq
- single-cell genomics
- singletCode
- singlets
All Science Journal Classification (ASJC) codes
- Biochemistry, Genetics and Molecular Biology (miscellaneous)
- Genetics