Abstract
DNA labeling is a powerful tool in molecular biology and biotechnology that allows for the visualization, detection, and study of DNA at the molecular level. Under this paradigm, a DNA molecule is labeled with k specific patterns and then imaged. The resulting image is modeled as a (k + 1)-ary sequence in which any non-zero symbol indicates the appearance of the corresponding label in the DNA molecule. The primary goal of this work is to study the labeling capacity, defined as the maximal information rate that can be obtained using this labeling process. The labeling capacity is computed for almost any pattern of a single label, and several results for multiple labels are provided as well. Moreover, we provide the minimal number of labels of length one or two, over any alphabet of size q, that are needed to achieve the maximum labeling capacity of log2(q). Lastly, we discuss the maximal labeling capacity that can be achieved using a given number of labels of length two.
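The labeling model described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `label_sequence` is hypothetical, and the sketch assumes that a non-zero symbol j is written at the position where the j-th label starts, and that when several labels match at the same position the first one listed wins. The paper's formal definition may differ on both points.

```python
def label_sequence(dna, labels):
    """Map a DNA string to a (k+1)-ary sequence for k label patterns.

    Assumption (not from the source): symbol j (1-based) marks the
    position where labels[j-1] starts; 0 means no label starts there.
    """
    out = [0] * len(dna)
    for i in range(len(dna)):
        for j, pattern in enumerate(labels, start=1):
            # str.startswith with an offset checks for the pattern at position i
            if dna.startswith(pattern, i):
                out[i] = j
                break  # assumed tie-break: first listed label wins
    return out


if __name__ == "__main__":
    # Two labels (k = 2), so the output is a ternary sequence
    print(label_sequence("ACGTAC", ["AC", "G"]))
```

Different DNA strings can map to the same output sequence, which is why the information rate obtained from the labeled image is in general below log2(q) per symbol.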
Original language | English |
---|---|
Journal | IEEE Transactions on Information Theory |
State | Accepted/In press - 2025 |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences