Correcting Deletions With Multiple Reads

Johan Chrisnata, Han Mao Kiah, Eitan Yaakobi

Research output: Contribution to journal › Article › peer-review

Abstract

The sequence reconstruction problem, introduced by Levenshtein in 2001, considers a communication scenario where the sender transmits a codeword from some codebook and the receiver obtains multiple noisy reads of the codeword. Motivated by modern storage devices, we introduced a variant of the problem where the number of noisy reads $N$ is fixed. Of significance, for the single-deletion channel, using $\log_2\log_2 n + O(1)$ redundant bits, we designed a reconstruction code of length $n$ that reconstructs codewords from two distinct noisy reads (Cai et al., 2021). In this work, we show that $\log_2\log_2 n - O(1)$ redundant bits are necessary for such reconstruction codes, thereby demonstrating the optimality of the construction. Furthermore, we show that these reconstruction codes can be used in $t$-deletion channels (with $t \geqslant 2$) to uniquely reconstruct codewords from $n^{t-1}/(t-1)! + O(n^{t-2})$ distinct noisy reads. For the two-deletion channel, using higher-order VT syndromes and certain runlength constraints, we designed the class of higher-order constrained shifted VT codes with $2\log_2 n + o(\log_2 n)$ redundancy bits that can reconstruct any codeword from any $N \geqslant 5$ of its length-$(n-2)$ subsequences.
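To make the role of the number of reads concrete, the following is a minimal brute-force sketch, not the construction from the paper. It relies on the standard observation that unique reconstruction from N distinct reads is possible exactly when the deletion balls of any two distinct codewords share fewer than N words. The function names (`deletion_ball`, `vt_syndrome`, `max_ball_intersection`) and the toy length n = 6 are illustrative assumptions; the only code used is the classical Varshamov-Tenengolts (VT) code with syndrome 0.

```python
from itertools import combinations, product


def deletion_ball(word: str, t: int = 1) -> set:
    """Return all distinct words obtained by deleting exactly t symbols."""
    n = len(word)
    return {
        "".join(word[i] for i in keep)
        for keep in combinations(range(n), n - t)
    }


def vt_syndrome(word: str) -> int:
    """Classical VT syndrome: sum of i * x_i (1-indexed) modulo n + 1."""
    n = len(word)
    return sum(i * int(bit) for i, bit in enumerate(word, start=1)) % (n + 1)


def max_ball_intersection(words, t: int = 1) -> int:
    """Largest overlap between the t-deletion balls of two distinct words."""
    balls = [deletion_ball(w, t) for w in words]
    return max((len(a & b) for a, b in combinations(balls, 2)), default=0)


n = 6  # toy length, small enough for exhaustive search
all_words = ["".join(bits) for bits in product("01", repeat=n)]
vt_code = [w for w in all_words if vt_syndrome(w) == 0]

max_all = max_ball_intersection(all_words)  # no coding at all
max_vt = max_ball_intersection(vt_code)     # single-deletion-correcting code

print(max_all, max_vt)
```

For this toy length the search reports an overlap of 2 for unconstrained words and 0 for the VT code. Three distinct reads therefore suffice without any coding, one read requires a full single-deletion-correcting code, and the two-read case studied in the abstract sits in between: it only needs a codebook whose pairwise overlap is at most 1.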

Original language: English
Pages (from-to): 7141-7158
Number of pages: 18
Journal: IEEE Transactions on Information Theory
Volume: 68
Issue number: 11
State: Published - 1 Nov 2022

Keywords

  • DNA-based data storage
  • deletion channel
  • error-correcting code
  • multiple reads
  • sequence reconstruction problem

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
