TY - GEN

T1 - An almost optimal edit distance oracle

AU - Charalampopoulos, Panagiotis

AU - Gawrychowski, Paweł

AU - Mozes, Shay

AU - Weimann, Oren

N1 - Funding Information: Panagiotis Charalampopoulos: Supported by Israel Science Foundation grant 592/17. Shay Mozes: Partially supported by Israel Science Foundation grant 592/17. Oren Weimann: Partially supported by Israel Science Foundation grant 592/17. Publisher Copyright: © 2021 Panagiotis Charalampopoulos, Paweł Gawrychowski, Shay Mozes, and Oren Weimann.

PY - 2021/7/1

Y1 - 2021/7/1

N2 - We consider the problem of preprocessing two strings S and T, of lengths m and n, respectively, in order to be able to efficiently answer the following queries: Given positions i, j in S and positions a, b in T, return the optimal alignment score of S[i..j] and T[a..b]. Let N = mn. We present an oracle with preprocessing time N^{1+o(1)} and space N^{1+o(1)} that answers queries in log^{2+o(1)} N time. In other words, we show that we can efficiently query for the alignment score of every pair of substrings after preprocessing the input for almost the same time it takes to compute just the alignment of S and T. Our oracle uses ideas from our distance oracle for planar graphs [STOC 2019] and exploits the special structure of the alignment graph. Conditioned on popular hardness conjectures, this result is optimal up to subpolynomial factors. Our results apply to both edit distance and longest common subsequence (LCS). The best previously known oracle with construction time and size O(N) has slow Ω(√N) query time [Sakai, TCS 2019], and the one with size N^{1+o(1)} and query time log^{2+o(1)} N (using a planar graph distance oracle) has slow Ω(N^{3/2}) construction time [Long & Pettie, SODA 2021]. We improve both approaches by roughly a √N factor.

AB - We consider the problem of preprocessing two strings S and T, of lengths m and n, respectively, in order to be able to efficiently answer the following queries: Given positions i, j in S and positions a, b in T, return the optimal alignment score of S[i..j] and T[a..b]. Let N = mn. We present an oracle with preprocessing time N^{1+o(1)} and space N^{1+o(1)} that answers queries in log^{2+o(1)} N time. In other words, we show that we can efficiently query for the alignment score of every pair of substrings after preprocessing the input for almost the same time it takes to compute just the alignment of S and T. Our oracle uses ideas from our distance oracle for planar graphs [STOC 2019] and exploits the special structure of the alignment graph. Conditioned on popular hardness conjectures, this result is optimal up to subpolynomial factors. Our results apply to both edit distance and longest common subsequence (LCS). The best previously known oracle with construction time and size O(N) has slow Ω(√N) query time [Sakai, TCS 2019], and the one with size N^{1+o(1)} and query time log^{2+o(1)} N (using a planar graph distance oracle) has slow Ω(N^{3/2}) construction time [Long & Pettie, SODA 2021]. We improve both approaches by roughly a √N factor.

KW - Edit distance

KW - Longest common subsequence

KW - Planar graphs

KW - Voronoi diagrams

UR - http://www.scopus.com/inward/record.url?scp=85115310343&partnerID=8YFLogxK

U2 - https://doi.org/10.4230/LIPIcs.ICALP.2021.48

DO - https://doi.org/10.4230/LIPIcs.ICALP.2021.48

M3 - Conference contribution

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021

A2 - Bansal, Nikhil

A2 - Merelli, Emanuela

A2 - Worrell, James

PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing

T2 - 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021

Y2 - 12 July 2021 through 16 July 2021

ER -