TY - GEN

T1 - A linear time approximation scheme for maximum quartet consistency on sparse sampled inputs

AU - Snir, Sagi

AU - Yuster, Raphael

PY - 2011

Y1 - 2011

N2 - Phylogenetic tree reconstruction is a fundamental biological problem. Quartet amalgamation - combining a set of trees over four taxa into a tree over the full set - stands at the heart of many phylogenetic reconstruction methods. However, even reconstruction from a consistent set of quartet trees, i.e. all quartets agree with some tree, is NP-hard, and the best approximation ratio known is 1/3. For a dense input of Θ(n^4) quartets (not necessarily consistent), the problem has a polynomial time approximation scheme. When the number of taxa grows, considering such dense inputs is impractical and some sampling approach is imperative. In this paper we show that if the number of quartets sampled is at least Θ(n^2 log n), there is a randomized approximation scheme, that runs in linear time in the number of quartets. The previously known polynomial approximation scheme for that problem required a very dense sample of size Θ(n^4). We note that samples of size Θ(n^2 log n) are sparse in the full quartet set.

AB - Phylogenetic tree reconstruction is a fundamental biological problem. Quartet amalgamation - combining a set of trees over four taxa into a tree over the full set - stands at the heart of many phylogenetic reconstruction methods. However, even reconstruction from a consistent set of quartet trees, i.e. all quartets agree with some tree, is NP-hard, and the best approximation ratio known is 1/3. For a dense input of Θ(n^4) quartets (not necessarily consistent), the problem has a polynomial time approximation scheme. When the number of taxa grows, considering such dense inputs is impractical and some sampling approach is imperative. In this paper we show that if the number of quartets sampled is at least Θ(n^2 log n), there is a randomized approximation scheme, that runs in linear time in the number of quartets. The previously known polynomial approximation scheme for that problem required a very dense sample of size Θ(n^4). We note that samples of size Θ(n^2 log n) are sparse in the full quartet set.

UR - http://www.scopus.com/inward/record.url?scp=80052364613&partnerID=8YFLogxK

U2 - https://doi.org/10.1007/978-3-642-22935-0_29

DO - https://doi.org/10.1007/978-3-642-22935-0_29

M3 - Conference contribution

SN - 9783642229343

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 339

EP - 350

BT - Approximation, Randomization, and Combinatorial Optimization

T2 - 14th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2011 and the 15th International Workshop on Randomization and Computation, RANDOM 2011

Y2 - 17 August 2011 through 19 August 2011

ER -