TY - GEN
T1 - A Coding Theory Perspective on Multiplexed Molecular Profiling of Biological Tissues
AU - D'Alessio, Luca
AU - Liu, Litian
AU - Duffy, Ken
AU - Eldar, Yonina C.
AU - Medard, Muriel
AU - Babadi, Mehrtash
N1 - Publisher Copyright: © 2020 IEICE.
PY - 2020/10/24
Y1 - 2020/10/24
N2 - High-throughput and quantitative experimental technologies are experiencing rapid advances in the biological sciences. One important recent technique is multiplexed fluorescence in situ hybridization (mFISH), which enables the identification and localization of large numbers of individual strands of RNA within single cells. Core to that technology is a coding problem: with each RNA sequence of interest being a codeword, how to design a codebook of probes, and how to decode the resulting noisy measurements? Published work has relied on assumptions of uniformly distributed codewords and binary symmetric channels for decoding and, to a lesser degree, for code construction. Here we establish that both of these assumptions are inappropriate in the context of mFISH experiments and that substantial decoding performance gains can be obtained by using more appropriate, less classical, assumptions. We propose a more appropriate asymmetric channel model that can be readily parameterized from data and use it to develop maximum a posteriori (MAP) decoders. We show that the false discovery rate for rare RNAs, which is the key experimental metric, is vastly improved with MAP decoders even when employed with the existing sub-optimal codebook. Using an evolutionary optimization methodology, we further show that by permuting the codebook to better align with the prior, which is an experimentally straightforward procedure, significant further improvements are possible.
AB - High-throughput and quantitative experimental technologies are experiencing rapid advances in the biological sciences. One important recent technique is multiplexed fluorescence in situ hybridization (mFISH), which enables the identification and localization of large numbers of individual strands of RNA within single cells. Core to that technology is a coding problem: with each RNA sequence of interest being a codeword, how to design a codebook of probes, and how to decode the resulting noisy measurements? Published work has relied on assumptions of uniformly distributed codewords and binary symmetric channels for decoding and, to a lesser degree, for code construction. Here we establish that both of these assumptions are inappropriate in the context of mFISH experiments and that substantial decoding performance gains can be obtained by using more appropriate, less classical, assumptions. We propose a more appropriate asymmetric channel model that can be readily parameterized from data and use it to develop maximum a posteriori (MAP) decoders. We show that the false discovery rate for rare RNAs, which is the key experimental metric, is vastly improved with MAP decoders even when employed with the existing sub-optimal codebook. Using an evolutionary optimization methodology, we further show that by permuting the codebook to better align with the prior, which is an experimentally straightforward procedure, significant further improvements are possible.
UR - http://www.scopus.com/inward/record.url?scp=85102617689&partnerID=8YFLogxK
U2 - https://doi.org/10.34385/proc.65.B07-1
DO - https://doi.org/10.34385/proc.65.B07-1
M3 - Conference contribution
T3 - International Symposium on Information Theory and its Applications, ISITA
SP - 309
EP - 313
BT - 2020 International Symposium on Information Theory and Its Applications (ISITA)
T2 - 16th International Symposium on Information Theory and its Applications, ISITA 2020
Y2 - 24 October 2020 through 27 October 2020
ER -