TY - JOUR
T1 - Cell-specific priors rescue differential gene expression in spatial spot-based technologies
AU - Nahman, Ornit
AU - Few-Cooper, Timothy J.
AU - Shen-Orr, Shai S.
N1 - Publisher Copyright: © The Author(s) 2024. Published by Oxford University Press.
PY - 2024/11/22
Y1 - 2024/11/22
N2 - Spatial transcriptomics (ST), a breakthrough technology, captures the complex structure and state of tissues through the spatial profiling of gene expression. A variety of ST technologies have now emerged, most prominently spot-based platforms such as Visium. Despite the widespread use of ST and its distinct data characteristics, the vast majority of studies continue to analyze ST data using algorithms originally designed for older technologies such as single-cell (SC) and bulk RNA-seq, particularly when identifying differentially expressed genes (DEGs). However, it remains unclear whether these algorithms are still valid or appropriate for ST data. Therefore, here, we sought to characterize the performance of these methods by constructing an in silico simulator of ST data with a controllable and known DEG ground truth. Surprisingly, our findings reveal little variation in the performance of classic DEG algorithms, all of which fail to accurately recapture known DEGs to significant levels. We further demonstrate that cellular heterogeneity within spots is a primary cause of this poor performance and propose a simple gene-selection scheme, based on prior knowledge of cell-type specificity, to overcome this. Notably, our approach outperforms existing data-driven methods designed specifically for ST data and offers improved DEG recovery and reliability rates. In summary, our work details a conceptual framework that can be used upstream, agnostically, of any DEG algorithm to improve the accuracy of ST analysis and any downstream findings.
AB - Spatial transcriptomics (ST), a breakthrough technology, captures the complex structure and state of tissues through the spatial profiling of gene expression. A variety of ST technologies have now emerged, most prominently spot-based platforms such as Visium. Despite the widespread use of ST and its distinct data characteristics, the vast majority of studies continue to analyze ST data using algorithms originally designed for older technologies such as single-cell (SC) and bulk RNA-seq, particularly when identifying differentially expressed genes (DEGs). However, it remains unclear whether these algorithms are still valid or appropriate for ST data. Therefore, here, we sought to characterize the performance of these methods by constructing an in silico simulator of ST data with a controllable and known DEG ground truth. Surprisingly, our findings reveal little variation in the performance of classic DEG algorithms, all of which fail to accurately recapture known DEGs to significant levels. We further demonstrate that cellular heterogeneity within spots is a primary cause of this poor performance and propose a simple gene-selection scheme, based on prior knowledge of cell-type specificity, to overcome this. Notably, our approach outperforms existing data-driven methods designed specifically for ST data and offers improved DEG recovery and reliability rates. In summary, our work details a conceptual framework that can be used upstream, agnostically, of any DEG algorithm to improve the accuracy of ST analysis and any downstream findings.
KW - deconvolution
KW - differentially expressed genes
KW - gene specificity
KW - spatial transcriptomics
UR - http://www.scopus.com/inward/record.url?scp=85212888609&partnerID=8YFLogxK
U2 - https://doi.org/10.1093/bib/bbae621
DO - https://doi.org/10.1093/bib/bbae621
M3 - Article
C2 - 39679437
SN - 1467-5463
VL - 26
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 1
ER -