Comparative study of the effectiveness and limitations of current methods for detecting sequence coevolution

WZ Mao, C Kaya, A Dutta, Amnon Horovitz, I Bahar

Research output: Contribution to journalArticlepeer-review

Abstract

Motivation: With rapid accumulation of sequence data on several species, extracting rational and systematic information from multiple sequence alignments (MSAs) is becoming increasingly important. Currently, there is a plethora of computational methods for investigating coupled evolutionary changes in pairs of positions along the amino acid sequence, and making inferences on structure and function. Yet, the significance of coevolution signals remains to be established. Also, a large number of false positives (FPs) arise from insufficient MSA size, phylogenetic background and indirect couplings. Results: Here, a set of 16 pairs of non-interacting proteins is thoroughly examined to assess the effectiveness and limitations of different methods. The analysis shows that recent computationally expensive methods designed to remove biases from indirect couplings outperform others in detecting tertiary structural contacts as well as eliminating intermolecular FPs; whereas traditional methods such as mutual information benefit from refinements such as shuffling, while being highly efficient. Computations repeated with 2,330 pairs of protein families from the Negatome database corroborated these results. Finally, using a training dataset of 162 families of proteins, we propose a combined method that outperforms existing individual methods. Overall, the study provides simple guidelines towards the choice of suitable methods and strategies based on available MSA size and computing resources.

Original languageEnglish
Pages (from-to)1929-1937
Number of pages9
JournalBioinformatics
Volume31
Issue number12
DOIs
StatePublished - 15 Jun 2015

All Science Journal Classification (ASJC) codes

  • Computational Mathematics
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability
  • Computer Science Applications
  • Computational Theory and Mathematics

Fingerprint

Dive into the research topics of 'Comparative study of the effectiveness and limitations of current methods for detecting sequence coevolution'. Together they form a unique fingerprint.

Cite this