TY - GEN
T1 - Sublinear Algorithms for Gap Edit Distance
AU - Goldenberg, Elazar
AU - Krauthgamer, Robert
AU - Saha, Barna
N1 - NA
PY - 2020/1/6
Y1 - 2020/1/6
N2 - The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic programming computes the edit distance between two strings of length n in O(n2) time, and a more sophisticated algorithm runs in time O(n + t2) when the edit distance is t [Landau, Myers and Schmidt, SICOMP 1998]. In pursuit of obtaining faster running time, the last couple of decades have seen a flurry of research on approximating edit distance, including polylogarithmic approximation in near-linear time [Andoni, Krauthgamer and Onak, FOCS 2010], and a constant-factor approximation in subquadratic time [Chakrabarty, Das, Goldenberg, Koucḱy and Saks, FOCS 2018]. We study sublinear-Time algorithms for small edit distance, which was investigated extensively because of its numerous applications. Our main result is an algorithm for distinguishing whether the edit distance is at most t or at least t2 (the quadratic gap problem) in time Õ(n/t+t3). This time bound is sublinear roughly for all t in [ω(1), o(n1/3)], which was not known before. The best previous algorithms solve this problem in sublinear time only for t=ω(n1/3) [Andoni and Onak, STOC 2009]. Our algorithm is based on a new approach that adaptively switches between uniform sampling and reading contiguous blocks of the input strings. In contrast, all previous algorithms choose which coordinates to query non-Adaptively. Moreover, it can be extended to solve the t vs t2-ϵ gap problem in time Õ(n/t1-ϵ+t3).
AB - The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic programming computes the edit distance between two strings of length n in O(n2) time, and a more sophisticated algorithm runs in time O(n + t2) when the edit distance is t [Landau, Myers and Schmidt, SICOMP 1998]. In pursuit of obtaining faster running time, the last couple of decades have seen a flurry of research on approximating edit distance, including polylogarithmic approximation in near-linear time [Andoni, Krauthgamer and Onak, FOCS 2010], and a constant-factor approximation in subquadratic time [Chakrabarty, Das, Goldenberg, Koucḱy and Saks, FOCS 2018]. We study sublinear-Time algorithms for small edit distance, which was investigated extensively because of its numerous applications. Our main result is an algorithm for distinguishing whether the edit distance is at most t or at least t2 (the quadratic gap problem) in time Õ(n/t+t3). This time bound is sublinear roughly for all t in [ω(1), o(n1/3)], which was not known before. The best previous algorithms solve this problem in sublinear time only for t=ω(n1/3) [Andoni and Onak, STOC 2009]. Our algorithm is based on a new approach that adaptively switches between uniform sampling and reading contiguous blocks of the input strings. In contrast, all previous algorithms choose which coordinates to query non-Adaptively. Moreover, it can be extended to solve the t vs t2-ϵ gap problem in time Õ(n/t1-ϵ+t3).
U2 - https://doi.org/10.1109/FOCS.2019.00070
DO - https://doi.org/10.1109/FOCS.2019.00070
M3 - منشور من مؤتمر
SN - 9781728149530
T3 - Annual IEEE Symposium on Foundations of Computer Science
SP - 1101
EP - 1120
BT - Proceedings - 2019 IEEE 60th Annual Symposium on Foundations of Computer Science, FOCS 2019
PB - IEEE Computer Society
T2 - 60th IEEE Annual Symposium on Foundations of Computer Science (FOCS)
Y2 - 9 November 2019 through 12 November 2019
ER -