TY - GEN
T1 - Using Generating Functions to Prove Additivity of Gene-Neighborhood Based Phylogenetics - Extended Abstract
AU - Katriel, Guy
AU - Mahanaymi, Udi
AU - Koutschan, Christoph
AU - Zeilberger, Doron
AU - Steel, Mike
AU - Snir, Sagi
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2023
Y1 - 2023
N2 - Prokaryotic evolution is often described as the Spaghetti of Life due to massive genome dynamics (GD) events of gene gain and loss, resulting in different evolutionary histories for the set of genes comprising the organism. These different histories, dubbed gene trees, provide confounding signals, hampering the attempt to reconstruct the species tree describing the main trend of evolution of the species under study. The synteny index (SI) between a pair of genomes combines gene order and gene content information, allowing comparison of unequal gene content genomes, together with order considerations of their common genes. Recently, GD has been modelled as a continuous-time Markov process. Under this formulation, the distance between genes along the chromosome was shown to follow a birth-death-immigration process. Using classical results from birth-death theory, we recently showed that the SI measure is consistent under that formulation. In this work, we provide an alternative, stand-alone combinatorial proof of the same result. By using generating function techniques, we derive explicit expressions of the system’s probabilistic dynamics in the form of rational functions of the model parameters. This, in turn, allows us to infer analytically the expected distances between organisms based on a transformation of their SI. Although the expressions obtained are rather complex, we establish additivity of this estimated evolutionary distance (a desirable property yielding phylogenetic consistency). This approach relies on holonomic functions and the Zeilberger Algorithm in order to establish additivity of the transformation of SI.
AB - Prokaryotic evolution is often described as the Spaghetti of Life due to massive genome dynamics (GD) events of gene gain and loss, resulting in different evolutionary histories for the set of genes comprising the organism. These different histories, dubbed gene trees, provide confounding signals, hampering the attempt to reconstruct the species tree describing the main trend of evolution of the species under study. The synteny index (SI) between a pair of genomes combines gene order and gene content information, allowing comparison of unequal gene content genomes, together with order considerations of their common genes. Recently, GD has been modelled as a continuous-time Markov process. Under this formulation, the distance between genes along the chromosome was shown to follow a birth-death-immigration process. Using classical results from birth-death theory, we recently showed that the SI measure is consistent under that formulation. In this work, we provide an alternative, stand-alone combinatorial proof of the same result. By using generating function techniques, we derive explicit expressions of the system’s probabilistic dynamics in the form of rational functions of the model parameters. This, in turn, allows us to infer analytically the expected distances between organisms based on a transformation of their SI. Although the expressions obtained are rather complex, we establish additivity of this estimated evolutionary distance (a desirable property yielding phylogenetic consistency). This approach relies on holonomic functions and the Zeilberger Algorithm in order to establish additivity of the transformation of SI.
KW - Generating Functions
KW - Genome Dynamics
KW - Holonomic Functions
KW - Markovian Processes
KW - Phylogenetics
UR - http://www.scopus.com/inward/record.url?scp=85174274672&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-7074-2_10
DO - 10.1007/978-981-99-7074-2_10
M3 - Conference contribution
SN - 9789819970735
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 120
EP - 135
BT - Bioinformatics Research and Applications - 19th International Symposium, ISBRA 2023, Proceedings
A2 - Guo, Xuan
A2 - Mangul, Serghei
A2 - Patterson, Murray
A2 - Zelikovsky, Alexander
PB - Springer Science and Business Media Deutschland GmbH
T2 - 19th International Symposium on Bioinformatics Research and Applications, ISBRA 2023
Y2 - 9 October 2023 through 12 October 2023
ER -