Automatic Inference of Sequence from Low-Resolution Crystallographic Data

Ziv Ben-Aharon, Michael Levitt, Nir Kalisman

Research output: Contribution to journal › Article › peer-review


At resolutions worse than 3.5 Å, the electron density is weak or nonexistent at the locations of the side chains. Consequently, the assignment of the protein sequences to their correct positions along the backbone is a difficult problem. In this work, we propose a fully automated computational approach to assign sequence at low resolution. It is based on our surprising observation that standard reciprocal-space indicators, such as the initial unrefined R value, are sensitive enough to detect an erroneous sequence assignment of even a single backbone position. Our approach correctly determines the amino acid type for 15%, 13%, and 9% of the backbone positions in crystallographic datasets with resolutions of 4.0 Å, 4.5 Å, and 5.0 Å, respectively. We implement these findings in an application for threading a sequence onto a backbone structure. For the three resolution ranges, the application threads 83%, 81%, and 64% of the sequences exactly as in the deposited PDB structures.

Ben-Aharon et al. find that certain crystallographic measures are more informative than previously assumed. They use these findings to solve a difficult technical problem in low-resolution crystallography: the identification of the amino acid types along the protein backbone.
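The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of the general idea it names: compute the conventional R value, R = Σ| |F_obs| − k|F_calc| | / Σ|F_obs|, for each candidate amino acid type at a position and keep the candidate with the lowest value. The function names `r_value` and `best_residue_type`, the simple linear scale factor k, and the amplitude numbers in the usage example are all assumptions made for illustration; they are not taken from the paper or from its application.

```python
from typing import Dict, Sequence


def r_value(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional crystallographic R value between two amplitude lists.

    Assumption: a single linear scale factor k = sum(f_obs) / sum(f_calc)
    is applied to the calculated amplitudes before comparison.
    """
    if len(f_obs) == 0 or len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must be non-empty and equal in length")
    sum_obs = sum(f_obs)
    sum_calc = sum(f_calc)
    if sum_obs <= 0 or sum_calc <= 0:
        raise ValueError("amplitude sums must be positive")
    k = sum_obs / sum_calc
    residual = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return residual / sum_obs


def best_residue_type(
    f_obs: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> str:
    """Return the candidate label whose calculated amplitudes give the lowest R.

    `candidates` maps a hypothetical residue label (e.g. "ALA") to the
    amplitudes calculated from a model carrying that residue at one position.
    """
    return min(candidates, key=lambda name: r_value(f_obs, candidates[name]))


if __name__ == "__main__":
    # Made-up amplitudes, for illustration only.
    observed = [10.0, 20.0, 30.0]
    trial_models = {
        "ALA": [12.0, 18.0, 30.0],
        "LEU": [10.0, 21.0, 29.0],
    }
    for label, amplitudes in trial_models.items():
        print(label, round(r_value(observed, amplitudes), 4))
    print("lowest R:", best_residue_type(observed, trial_models))
```

In practice the calculated amplitudes would come from a crystallographic package after placing each trial side chain on the backbone; that step is outside the scope of this sketch.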

Original language: American English
Pages (from-to): 1546-1554.e2
Issue number: 11
State: Published - 6 Nov 2018


Keywords

  • automatic threading
  • low-resolution crystallography
  • model building
  • reciprocal-space indicators

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Molecular Biology

