Clustering-Correcting Codes

Tal Shinkar, Eitan Yaakobi, Andreas Lenz, Antonia Wachter-Zeh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A new family of codes, called clustering-correcting codes, is presented in this paper. This family of codes is motivated by the special structure of data that is stored in DNA-based storage systems. The data stored in these systems has the form of unordered sequences, also called strands, and every strand is synthesized thousands to millions of times, where some of these copies are read back during sequencing. Due to the unordered structure of the strands, an important task in the decoding process is to place them in their correct order. This is usually accomplished by allocating a part of the strand for an index. However, in the presence of errors in the index field, important information on the order of the strands may be lost. Clustering-correcting codes ensure that if the distance between the index fields of two strands is small, then there will be a large distance between their data fields. It is shown how this property makes it possible to place the strands together in their correct clusters even in the presence of errors. We present lower and upper bounds on the size of clustering-correcting codes and an explicit construction of these codes which uses only a single bit of redundancy.
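
The defining property in the abstract (a small distance between index fields forces a large distance between data fields) can be illustrated with a short check. This is only a sketch under stated assumptions: the abstract does not name the distance measure or the thresholds, so Hamming distance, the parameter names `e` and `t`, and the function `satisfies_clustering_property` are all hypothetical choices made for illustration, not the paper's construction.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_clustering_property(strands, e, t):
    """Check the property on a list of (index, data) string pairs.

    Assumed reading of the abstract: whenever two index fields are
    within Hamming distance e of each other, the corresponding data
    fields must be at Hamming distance at least t.
    """
    for (idx1, data1), (idx2, data2) in combinations(strands, 2):
        if hamming(idx1, idx2) <= e and hamming(data1, data2) < t:
            return False
    return True


# Toy strands: 2-bit index field, 4-bit data field.
strands_ok = [("00", "0000"), ("01", "1110"), ("11", "0011")]
strands_bad = [("00", "0000"), ("01", "0001"), ("11", "0011")]

print(satisfies_clustering_property(strands_ok, e=1, t=3))
print(satisfies_clustering_property(strands_bad, e=1, t=3))
```

In the second toy set, the strands indexed `00` and `01` have nearby indices and nearly identical data, so an index error could move a read into the wrong cluster without the data field revealing it.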

Original language: English
Title of host publication: 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings
Pages: 81-85
Number of pages: 5
ISBN (Electronic): 9781538692912
State: Published - Jul 2019
Event: 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Paris, France
Duration: 7 Jul 2019 to 12 Jul 2019

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
Volume: 2019-July

Conference

Conference: 2019 IEEE International Symposium on Information Theory, ISIT 2019
Country/Territory: France
City: Paris
Period: 7/07/19 to 12/07/19

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modelling and Simulation
  • Applied Mathematics
