TY - GEN
T1 - Efficient Distributed Source Coding of Fragmented Genomic Sequencing Data
AU - Gershon, Yotam
AU - Cassuto, Yuval
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/7/12
Y1 - 2021/7/12
N2 - In this paper, we present a new compression scheme for genomic read data produced by modern sequencing technologies. In this setting, a reference genome similar to the one being sequenced is available only at the decoder, while the starting index of each read in this reference is unknown. The proposed scheme significantly reduces the encoding complexity relative to known reference-based compression schemes. The results include a code construction based on generalized concatenation coset codes, analysis of the decoding failure probability, and optimization of the scheme parameters for minimal compression rate.
AB - In this paper, we present a new compression scheme for genomic read data produced by modern sequencing technologies. In this setting, a reference genome similar to the one being sequenced is available only at the decoder, while the starting index of each read in this reference is unknown. The proposed scheme significantly reduces the encoding complexity relative to known reference-based compression schemes. The results include a code construction based on generalized concatenation coset codes, analysis of the decoding failure probability, and optimization of the scheme parameters for minimal compression rate.
UR - http://www.scopus.com/inward/record.url?scp=85115059669&partnerID=8YFLogxK
U2 - 10.1109/ISIT45174.2021.9518191
DO - 10.1109/ISIT45174.2021.9518191
M3 - Conference contribution
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 3302
EP - 3307
BT - 2021 IEEE International Symposium on Information Theory, ISIT 2021 - Proceedings
T2 - 2021 IEEE International Symposium on Information Theory, ISIT 2021
Y2 - 12 July 2021 through 20 July 2021
ER -