TY - GEN

T1 - Range LCP queries revisited

AU - Amir, Amihood

AU - Lewenstein, Moshe

AU - Thankachan, Sharma V.

N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.

PY - 2015

Y1 - 2015

N2 - The Range LCP problem is to preprocess a string S[1…n] to enable efficient solutions of the following query: given a range [l, r] as the input, report max_{i,j ∈ {l,…,r}} |LCP(S_i, S_j)|. Here LCP(S_i, S_j) is the longest common prefix of the suffixes of S starting at locations i and j, and |LCP(S_i, S_j)| is its length. We study a natural extension of this problem, where the query consists of two ranges. Additionally, we allow a bounded number (say k ≥ 0) of mismatches in the LCP computation. Specifically, our task is to report the following when two ranges [ℓ_1, r_1] and [ℓ_2, r_2] come as input: max_{ℓ_1 ≤ i ≤ r_1, ℓ_2 ≤ j ≤ r_2} |LCP_k(S_i, S_j)|. Here LCP_k(S_i, S_j) is the longest prefix of S_i and S_j with at most k mismatches allowed. We show that the queries can be answered in O(k) time using an O(n^2/w) space data structure, where w is the word size. We also present space-efficient data structures for k = 0 and k = 1. For k = 0, we obtain a linear-space data structure with query time O(√n/w log^ϵ n), where w is the word size and ϵ > 0 is an arbitrarily small constant. For the case k = 1, we obtain an O(n log n) space data structure with query time O(√n log n). Finally, we give a reduction from Set Intersection to Range LCP queries, suggesting that it will be very difficult to improve our upper bound by more than a factor of O(log^ϵ n).

AB - The Range LCP problem is to preprocess a string S[1…n] to enable efficient solutions of the following query: given a range [l, r] as the input, report max_{i,j ∈ {l,…,r}} |LCP(S_i, S_j)|. Here LCP(S_i, S_j) is the longest common prefix of the suffixes of S starting at locations i and j, and |LCP(S_i, S_j)| is its length. We study a natural extension of this problem, where the query consists of two ranges. Additionally, we allow a bounded number (say k ≥ 0) of mismatches in the LCP computation. Specifically, our task is to report the following when two ranges [ℓ_1, r_1] and [ℓ_2, r_2] come as input: max_{ℓ_1 ≤ i ≤ r_1, ℓ_2 ≤ j ≤ r_2} |LCP_k(S_i, S_j)|. Here LCP_k(S_i, S_j) is the longest prefix of S_i and S_j with at most k mismatches allowed. We show that the queries can be answered in O(k) time using an O(n^2/w) space data structure, where w is the word size. We also present space-efficient data structures for k = 0 and k = 1. For k = 0, we obtain a linear-space data structure with query time O(√n/w log^ϵ n), where w is the word size and ϵ > 0 is an arbitrarily small constant. For the case k = 1, we obtain an O(n log n) space data structure with query time O(√n log n). Finally, we give a reduction from Set Intersection to Range LCP queries, suggesting that it will be very difficult to improve our upper bound by more than a factor of O(log^ϵ n).

UR - http://www.scopus.com/inward/record.url?scp=84944705594&partnerID=8YFLogxK

U2 - https://doi.org/10.1007/978-3-319-23826-5_33

DO - https://doi.org/10.1007/978-3-319-23826-5_33

M3 - Conference contribution

SN - 9783319238258

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 350

EP - 361

BT - String Processing and Information Retrieval - 22nd International Symposium, SPIRE 2015, Proceedings

A2 - Puglisi, Simon J.

A2 - Iliopoulos, Costas S.

A2 - Yilmaz, Emine

PB - Springer Verlag

T2 - 22nd International Symposium on String Processing and Information Retrieval, SPIRE 2015

Y2 - 1 September 2015 through 4 September 2015

ER -