Genomic Compression with Decoder Alignment under Single Deletion and Multiple Substitutions

Yotam Gershon, Yuval Cassuto

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

We address the problem of compressing genomic read data produced by modern shotgun sequencing technologies, where a reference genome, closely similar to the sequenced one, is available only at the decoder. This problem, addressed by distributed source coding techniques, requires an alignment and validation layer in the decoder. In this work, we extend previous work to allow a single deletion along with the previously addressed multiple substitutions. The results include a new distance for efficient alignment under deletion and substitutions, a derivation of the exact distribution of this distance on random sequences, and procedures to recover the read from multiple invocations of a substitutions-only decoder.
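The abstract does not define the new distance, so the following is only an illustrative sketch of the general idea of scoring a read against a reference window when one deletion and several substitutions are allowed. It is not the distance proposed in the paper. The function name `single_deletion_hamming` and the convention that the reference window is exactly one symbol longer than the read are assumptions made for this example. The score is the smallest Hamming distance obtained by deleting one symbol from the window, computed in linear time from prefix and suffix mismatch counts.

```python
def single_deletion_hamming(read: str, window: str) -> tuple[int, int]:
    """Illustrative score only (not the paper's distance).

    Returns (distance, deleted_index): the fewest substitutions needed to
    turn `window`, with exactly one symbol deleted, into `read`, together
    with the first window index whose deletion achieves that minimum.
    """
    n = len(read)
    if len(window) != n + 1:
        raise ValueError("window must be exactly one symbol longer than read")

    # prefix[i]: mismatches between read[:i] and window[:i]
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + (read[i] != window[i])

    # suffix[i]: mismatches between read[i:] and window[i + 1:]
    suffix = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + (read[i] != window[i + 1])

    # Deleting window[d] aligns read[:d] with window[:d]
    # and read[d:] with window[d + 1:].
    best_d = min(range(n + 1), key=lambda d: prefix[d] + suffix[d])
    return prefix[best_d] + suffix[best_d], best_d
```

In an alignment layer of this kind, a decoder could evaluate such a score over candidate reference windows and keep those below a threshold; how the paper actually performs alignment and validation is not stated in the abstract.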

Original language: English
Title of host publication: 2022 IEEE International Symposium on Information Theory, ISIT 2022
Pages: 998-1003
Number of pages: 6
ISBN (Electronic): 9781665421591
State: Published - 2022
Event: 2022 IEEE International Symposium on Information Theory, ISIT 2022 - Espoo, Finland
Duration: 26 Jun 2022 – 1 Jul 2022

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
Volume: 2022-June

Conference

Conference: 2022 IEEE International Symposium on Information Theory, ISIT 2022
Country/Territory: Finland
City: Espoo
Period: 26/06/22 – 1/07/22

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modelling and Simulation
  • Applied Mathematics
