Longest common extensions in trees

Philip Bille, Paweł Gawrychowski, Inge Li Gørtz, Gad M. Landau, Oren Weimann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The longest common extension (LCE) of two indices in a string is the length of the longest identical substrings starting at these two indices. The LCE problem asks to preprocess a string into a compact data structure that supports fast LCE queries. In this paper we generalize the LCE problem to trees and suggest a few applications of LCE in trees to tries and XML databases. Given a labeled and rooted tree T of size n, the goal is to preprocess T into a compact data structure that supports the following LCE queries between subpaths and subtrees in T. Let v1, v2, w1, and w2 be nodes of T such that w1 and w2 are descendants of v1 and v2, respectively.

- LCE_PP(v1, w1, v2, w2): (path-path LCE) return the longest common prefix of the paths v1 ⇝ w1 and v2 ⇝ w2.
- LCE_PT(v1, w1, v2): (path-tree LCE) return a maximal path-path LCE of the path v1 ⇝ w1 and any path from v2 to a descendant leaf.
- LCE_TT(v1, v2): (tree-tree LCE) return a maximal path-path LCE of any pair of paths from v1 and v2 to descendant leaves.

We present the first non-trivial bounds for supporting these queries. For LCE_PP queries, we present a linear-space solution with O(log n) query time. For LCE_PT queries, we present a linear-space solution with O((log log n)^2) query time, and complement this with a lower bound showing that any path-tree LCE structure of size O(n polylog(n)) must necessarily use Ω(log log n) time to answer queries. For LCE_TT queries, we present a time-space trade-off that, given any parameter τ with 1 ≤ τ ≤ n, leads to an O(nτ)-space and O(n/τ)-query-time solution. This is complemented with a reduction to the set intersection problem implying that a fast linear-space solution is not likely to exist.
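To make the query definitions concrete, the following is a minimal brute-force sketch of the string LCE and the three tree queries. It is not the data structure from the paper and achieves none of the stated bounds; it only spells out what each query returns. Several details are assumptions of this sketch rather than statements from the abstract: labels are single characters stored on nodes, a path includes both of its endpoints, each query returns a length rather than the matched path, and the names `Tree`, `path_labels`, `lce_string`, `lce_pp`, `lce_pt`, and `lce_tt` are hypothetical.

```python
from typing import List, Optional


class Tree:
    """Rooted tree with one character label per node (an assumption of this sketch)."""

    def __init__(self) -> None:
        self.parent: List[Optional[int]] = []
        self.label: List[str] = []
        self.children: List[List[int]] = []

    def add(self, label: str, parent: Optional[int] = None) -> int:
        """Append a node and return its index."""
        node = len(self.label)
        self.parent.append(parent)
        self.label.append(label)
        self.children.append([])
        if parent is not None:
            self.children[parent].append(node)
        return node

    def path_labels(self, v: int, w: int) -> str:
        """Labels on the downward path v ~> w, where w is v or a descendant of v."""
        out = []
        node: Optional[int] = w
        while node is not None and node != v:  # climb from w up to v
            out.append(self.label[node])
            node = self.parent[node]
        if node is None:
            raise ValueError("w is not a descendant of v")
        out.append(self.label[v])
        return "".join(reversed(out))


def lce_string(s: str, i: int, j: int) -> int:
    """Classic string LCE: length of the longest common prefix of s[i:] and s[j:]."""
    k = 0
    while i + k < len(s) and j + k < len(s) and s[i + k] == s[j + k]:
        k += 1
    return k


def lce_pp(t: Tree, v1: int, w1: int, v2: int, w2: int) -> int:
    """Path-path LCE: common prefix length of the paths v1 ~> w1 and v2 ~> w2."""
    a, b = t.path_labels(v1, w1), t.path_labels(v2, w2)
    k = 0
    while k < min(len(a), len(b)) and a[k] == b[k]:
        k += 1
    return k


def lce_pt(t: Tree, v1: int, w1: int, v2: int) -> int:
    """Path-tree LCE: best match of v1 ~> w1 against any downward path from v2."""
    p = t.path_labels(v1, w1)

    def go(i: int, node: int) -> int:
        if i == len(p) or t.label[node] != p[i]:
            return 0
        return 1 + max((go(i + 1, c) for c in t.children[node]), default=0)

    return go(0, v2)


def lce_tt(t: Tree, v1: int, v2: int) -> int:
    """Tree-tree LCE: best match over all pairs of downward paths from v1 and v2."""
    if t.label[v1] != t.label[v2]:
        return 0
    pairs = (lce_tt(t, c1, c2) for c1 in t.children[v1] for c2 in t.children[v2])
    return 1 + max(pairs, default=0)
```

Each tree query here takes time proportional to the paths or subtrees it inspects (the tree-tree version can be quadratic), which is the kind of cost the preprocessing-based structures in the paper are designed to avoid.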

Original language: American English
Title of host publication: Combinatorial Pattern Matching - 26th Annual Symposium, CPM 2015, Proceedings
Editors: Ugo Vaccaro, Ely Porat, Ferdinando Cicalese
Publisher: Springer Verlag
Pages: 52-64
Number of pages: 13
ISBN (Print): 9783319199283
State: Published - 2015
Event: 26th Annual Symposium on Combinatorial Pattern Matching, CPM 2015 - Ischia Island, Italy
Duration: 29 Jun 2015 → 1 Jul 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9133

Conference

Conference: 26th Annual Symposium on Combinatorial Pattern Matching, CPM 2015
Country/Territory: Italy
City: Ischia Island
Period: 29/06/15 → 1/07/15

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
