Abstract
Pathogenic variants underlying Mendelian diseases often disrupt the normal physiology of a few tissues and organs. However, variant effect prediction tools that aim to identify pathogenic variants are typically oblivious to tissue contexts. Here we report a machine-learning framework, denoted “Tissue Risk Assessment of Causality by Expression for variants” (TRACEvar, https://netbio.bgu.ac.il/TRACEvar/), that offers two advancements. First, TRACEvar predicts pathogenic variants that disrupt the normal physiology of specific tissues. This was achieved by creating 14 tissue-specific models that were trained on over 14,000 variants and combined 84 attributes of genetic variants with 495 attributes derived from tissue omics. TRACEvar outperformed 10 well-established and tissue-oblivious variant effect prediction tools. Second, the resulting models are interpretable, thereby illuminating variants’ mode of action. Application of TRACEvar to variants of 52 rare-disease patients highlighted pathogenicity mechanisms and relevant disease processes. Lastly, the interpretation of all tissue models revealed that top-ranking determinants of pathogenicity included attributes of disease-affected tissues, particularly cellular process activities. Collectively, these results show that tissue contexts and interpretable machine-learning models can greatly enhance our understanding of the etiology of rare diseases.
Original language | American English |
---|---|
Pages (from-to) | 1187-1206 |
Number of pages | 20 |
Journal | Molecular Systems Biology |
Volume | 20 |
Issue number | 11 |
State | Published - 4 Nov 2024 |
Keywords
- Genomic Medicine
- Machine Learning
- Tissue-selectivity
- Variant Effect Prediction
- Variant Interpretation
All Science Journal Classification (ASJC) codes
- Information Systems
- General Immunology and Microbiology
- Applied Mathematics
- General Biochemistry, Genetics and Molecular Biology
- General Agricultural and Biological Sciences
- Computational Theory and Mathematics