Abstract
Motivated by the error behavior of the DNA storage channel, this paper extends the sequence reconstruction problem introduced by Levenshtein. The reconstruction problem studies a model in which the information is read through multiple noisy channels, and a decoder that receives all channel estimations is required to decode the information. In the combinatorial setup, every channel is assumed to cause at most t errors. Levenshtein considered the case in which all channels behave identically; we generalize this model by assuming that the channels are not identical, so different channels may cause different maximum numbers of errors. For example, we assume that there are N channels, each causing at most t1 or t2 errors, where t1 < t2, and that the number of channels with at most t1 errors is at least ⌈pN⌉ for some fixed 0 < p < 1. If the information codeword belongs to a code with minimum distance d, the problem is to find the minimum number of channels that guarantees successful decoding in the worst case. A second problem we study is the one in which the number of channels is fixed, and the question is to find the minimum distance d that guarantees exact reconstruction. We study these problems and show how to apply them to the cases of substitutions and transpositions.
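As a rough illustration of the identical-channel baseline that the abstract generalizes, the sketch below brute-forces Levenshtein's classical setting for binary substitutions. It assumes all channel outputs are distinct, so that the minimum number of channels equals one more than the largest intersection of two radius-t Hamming balls centered at words at distance at least d. The function names and the tiny parameters are hypothetical, chosen only for this example; this is not the paper's method for non-identical channels.

```python
from itertools import product


def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))


def ball_intersection_size(x, y, t):
    """Count binary words within Hamming distance t of both x and y (brute force)."""
    n = len(x)
    return sum(
        1
        for z in product((0, 1), repeat=n)
        if hamming_distance(x, z) <= t and hamming_distance(y, z) <= t
    )


def min_channels_identical(n, d, t):
    """Minimum number of identical channels (each with at most t substitutions)
    needed to reconstruct a word from a binary code of length n and minimum
    distance d, assuming all channel outputs are distinct.

    By symmetry of the Hamming space, the intersection size depends only on
    the distance between the two centers, so one representative pair per
    distance d, d+1, ..., n is enough.
    """
    x = (0,) * n
    largest = 0
    for dist in range(d, n + 1):
        y = (1,) * dist + (0,) * (n - dist)
        largest = max(largest, ball_intersection_size(x, y, t))
    return largest + 1


if __name__ == "__main__":
    print(min_channels_identical(5, 1, 1))  # prints 3
```

When d is at least 2t + 1 the two balls are disjoint and a single channel suffices, which matches ordinary unique decoding. The brute-force search is exponential in n and is meant only for checking small cases by hand.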
Original language | English |
---|---|
Article number | 8419713 |
Pages (from-to) | 1267-1286 |
Number of pages | 20 |
Journal | IEEE Transactions on Information Theory |
Volume | 65 |
Issue number | 2 |
DOIs | |
State | Published - Feb 2019 |
Keywords
- DNA storage
- Hamming errors
- The sequence reconstruction problem
- The Johnson graph
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences