Abstract
The sequence reconstruction problem, introduced by Levenshtein in 2001, considers a communication scenario where the sender transmits a codeword from some codebook and the receiver obtains multiple noisy reads of the codeword. Motivated by modern storage devices, we introduced a variant of the problem where the number of noisy reads $N$ is fixed. Notably, for the single-deletion channel, we designed a reconstruction code of length $n$ with $\log_2\log_2 n + O(1)$ redundant bits that reconstructs codewords from two distinct noisy reads (Cai et al., 2021). In this work, we show that $\log_2\log_2 n - O(1)$ redundant bits are necessary for such reconstruction codes, thereby demonstrating the optimality of the construction. Furthermore, we show that these reconstruction codes can be used in $t$-deletion channels (with $t \geq 2$) to uniquely reconstruct codewords from $n^{t-1}/(t-1)! + O(n^{t-2})$ distinct noisy reads. For the two-deletion channel, using higher order VT syndromes and certain runlength constraints, we designed the class of higher order constrained shifted VT codes with $2\log_2 n + o(\log_2 n)$ redundancy bits that can reconstruct any codeword from any $N \geq 5$ of its length-$(n-2)$ subsequences.
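As an illustration of the read model described above, the following is a minimal brute-force sketch, not the construction from the paper. It assumes the standard condition for unique reconstruction: a codebook can be recovered from any N distinct reads if every pair of distinct codewords shares fewer than N possible reads. The function names (`deletion_ball`, `is_reconstruction_code`, `max_intersection`) are hypothetical and chosen only for this example.

```python
from itertools import product


def deletion_ball(x, t=1):
    """Return the set of distinct strings obtained by deleting exactly t symbols from x."""
    ball = {x}
    for _ in range(t):
        # Delete one more symbol from every string collected so far.
        ball = {y[:i] + y[i + 1:] for y in ball for i in range(len(y))}
    return ball


def is_reconstruction_code(code, n_reads, t=1):
    """Check (by brute force) that any two distinct codewords share fewer than n_reads reads."""
    words = sorted(set(code))
    balls = [deletion_ball(w, t) for w in words]
    return all(
        len(balls[i] & balls[j]) < n_reads
        for i in range(len(words))
        for j in range(i + 1, len(words))
    )


def max_intersection(n, t=1):
    """Largest overlap between t-deletion balls of two distinct binary words of length n."""
    words = ["".join(bits) for bits in product("01", repeat=n)]
    balls = [deletion_ball(w, t) for w in words]
    return max(
        len(balls[i] & balls[j])
        for i in range(len(words))
        for j in range(i + 1, len(words))
    )


if __name__ == "__main__":
    # Two words whose single-deletion balls overlap in two reads ("000" and "010"),
    # so two reads are not always enough to tell them apart, but three are.
    pair = {"0100", "0010"}
    print(is_reconstruction_code(pair, n_reads=2))
    print(is_reconstruction_code(pair, n_reads=3))
    print(max_intersection(4))
```

The exhaustive search is exponential in n and is only meant for small lengths. It shows why the case of two reads over the single-deletion channel is the interesting one: without any constraint on the codebook, some pairs of words share two reads, so a code for N = 2 must exclude such pairs, which costs redundancy.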
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 7141-7158 |
| Number of pages | 18 |
| Journal | IEEE Transactions on Information Theory |
| Volume | 68 |
| Issue number | 11 |
| DOIs | |
| State | Published - 1 Nov 2022 |
Keywords
- DNA-based data storage
- deletion channel
- error-correcting code
- multiple reads
- sequence reconstruction problem
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences