Abstract
Protein-binding microarrays (PBMs), a novel high-throughput technology, measure the binding intensity of a transcription factor to thousands of DNA probe sequences. Several algorithms have been developed to extract binding-site motifs from these data. Such motifs are commonly represented by positional weight matrices. Previous studies have shown that the motifs produced by these algorithms are either accurate in predicting in vitro binding or similar to previously published motifs, but not both. In this work, we present a new, simple algorithm to infer binding-site motifs from PBM data. It outperforms prior art both in predicting in vitro binding and in producing motifs similar to literature motifs. Our results challenge previous claims that motifs with lower information content are better models of transcription-factor binding specificity. Moreover, we tested the effect of motif length and of the side positions flanking the "core" motif in the binding site. We show that side positions have a significant effect and should not be removed, as is commonly done. For all methods, we observe a large drop in result quality between in vitro and in vivo binding prediction. The software is available at acgt.cs.tau.ac.il/rap.
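The abstract relies on two notions, the positional weight matrix and its information content. They can be illustrated with a minimal Python sketch. This is not the algorithm presented in the paper. The function names, the pseudocount, the uniform background, and the toy binding sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so no probability is zero.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def information_content(pwm):
    """Total information content in bits, relative to a uniform background."""
    return sum(
        2.0 + sum(p * math.log2(p) for p in col.values() if p > 0)
        for col in pwm
    )

def best_log_odds(pwm, probe, background=0.25):
    """Highest log-odds score over all windows of the probe (forward strand only)."""
    width = len(pwm)
    scores = [
        sum(math.log2(pwm[i][probe[start + i]] / background) for i in range(width))
        for start in range(len(probe) - width + 1)
    ]
    return max(scores)

# Hypothetical aligned binding sites, invented for this example.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
ic = information_content(pwm)
score_hit = best_log_odds(pwm, "AAATGACTCAGGG")   # probe containing a site
score_miss = best_log_odds(pwm, "AAAAAAAAAAAAA")  # probe without a site
```

In this sketch, a probe that contains one of the toy sites scores higher than a probe that does not. Trimming columns from either end of `pwm` is one simple way to explore how flanking positions change the scores.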
| Original language | English |
|---|---|
| Pages (from-to) | 375-382 |
| Number of pages | 8 |
| Journal | Journal of Computational Biology |
| Volume | 20 |
| Issue number | 5 |
| State | Published - 1 May 2013 |
Keywords
- Motif finding
- Protein-binding microarray
- Protein-binding site
All Science Journal Classification (ASJC) codes
- Modelling and Simulation
- Molecular Biology
- Genetics
- Computational Mathematics
- Computational Theory and Mathematics