Tensor composition analysis detects cell-type specific associations in epigenetic studies

Elior Rahmani, Regev Schweiger, Saharon Rosset, Sriram Sankararaman, Eran Halperin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Identifying cell-type specific associations of genes with disease and mapping known associations to particular cell types is key to understanding disease etiology. While developments in technologies for profiling genomic features such as gene expression and DNA methylation have led to the availability of large-scale tissue-specific genomic data, prohibitive costs drastically restrict collection of cell-type specific genomic data. This, in turn, limits the identification of disease-related genes and cell types. It is therefore desirable to develop new approaches for detecting cell-type specific associations between phenotypes and tissue-specific genomic data. We suggest a new matrix factorization formulation, which allows us to deconvolve a two-dimensional input (observations by features) into a three-dimensional output. Traditional matrix factorization formulations essentially take as input a multiple-source heterogeneous matrix of observations and output a matrix of source-specific weights and a matrix of source-specific features. We generalize this approach by assuming that source-specific features are unique for each observation rather than shared across all observations, and we propose Tensor Composition Analysis (TCA), a method for estimating observation- and source-specific values based on the model. We apply our model in the context of epigenetic association studies, where DNA methylation data measured from a heterogeneous tissue are often used, and we show that TCA allows us to extract cell-type specific methylation levels from two-dimensional tissue-specific methylation data. We further derive a statistical test for detecting cell-type specific effects of methylation on phenotypes based on the TCA model, and using a simulation study we demonstrate its potential and limitations. Finally, using five large whole-blood methylation datasets, we demonstrate that our model allows the detection of novel replicating cell-type specific associations without collecting cost-prohibitive cell-type specific data, thus suggesting an exciting new opportunity to unveil more of the hidden signals in genomic association studies, with potential design implications for future data collection efforts.
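
The contrast the abstract draws between a traditional factorization and the TCA formulation can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the function name `simulate_mixtures`, the Dirichlet-distributed weights, the Gaussian per-observation variation, and all parameter values are assumptions chosen purely for illustration, and the estimation step that TCA performs is not shown.

```python
import numpy as np


def simulate_mixtures(n_obs=50, n_features=20, n_sources=3, noise_sd=0.05, seed=0):
    """Simulate a 2-D mixture matrix under two generative assumptions.

    All distributions and parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # W[i, h]: weight of source h (e.g. a cell type) in observation i.
    # Rows sum to 1, like cell-type proportions in a tissue sample.
    W = rng.dirichlet(np.ones(n_sources), size=n_obs)

    # H[h, j]: source-specific value of feature j, shared by all observations.
    H = rng.uniform(0.0, 1.0, size=(n_sources, n_features))

    # Traditional formulation: every observation mixes the same H.
    X_classic = W @ H

    # Tensor formulation: each observation i has its own source-specific
    # values Z[i, h, j], here drawn around the shared means in H (assumption).
    Z = H[None, :, :] + rng.normal(0.0, noise_sd, size=(n_obs, n_sources, n_features))

    # Observed 2-D data: X_tensor[i, j] = sum over h of W[i, h] * Z[i, h, j].
    X_tensor = np.einsum("ih,ihj->ij", W, Z)

    return W, H, Z, X_classic, X_tensor


if __name__ == "__main__":
    W, H, Z, X_classic, X_tensor = simulate_mixtures()
    print("observed matrix shape:", X_tensor.shape)
    print("underlying tensor shape:", Z.shape)
```

In this sketch `Z` plays the role of the three-dimensional quantity (observations by sources by features) that the abstract describes as the output, while only the two-dimensional `X_tensor` would be observed in practice.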

Original language: English
Title of host publication: Research in Computational Molecular Biology - 22nd Annual International Conference, RECOMB 2018, Proceedings
Editors: Benjamin J. Raphael
Pages: 274-275
Number of pages: 2
State: Published - 2018
Event: 22nd International Conference on Research in Computational Molecular Biology, RECOMB 2018 - Paris, France
Duration: 21 Apr 2018 → 24 Apr 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10812 LNBI

Conference

Conference: 22nd International Conference on Research in Computational Molecular Biology, RECOMB 2018
Country/Territory: France
City: Paris
Period: 21/04/18 → 24/04/18

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
