The Streaming k-Mismatch Problem: Tradeoffs between Space and Total Time

Shay Golan, Tomasz Kociumaka, Tsvi Kopelowitz, Ely Porat

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We revisit the k-mismatch problem in the streaming model on a pattern of length $m$ and a streaming text of length $n$, both over a size-$\sigma$ alphabet. The current state-of-the-art algorithm for the streaming k-mismatch problem, by Clifford et al. [SODA 2019], uses $\tilde{O}(k)$ space and $\tilde{O}(\sqrt{k})$ worst-case time per character. The space complexity is known to be (unconditionally) optimal, and the worst-case time per character matches a conditional lower bound. However, there is a gap between the total time cost of the algorithm, which is $\tilde{O}(n\sqrt{k})$, and the fastest known offline algorithm, which costs $\tilde{O}\left(n + \min\left(\frac{nk}{\sqrt{m}}, \sigma n\right)\right)$ time. Moreover, it is not known whether improvements over the $\tilde{O}(n\sqrt{k})$ total time are possible when using more than $O(k)$ space. We address these gaps by designing a randomized streaming algorithm for the k-mismatch problem that, given an integer parameter $k \le s \le m$, uses $\tilde{O}(s)$ space and costs $\tilde{O}\left(n + \min\left(\frac{nk^2}{m}, \frac{nk}{\sqrt{s}}, \frac{\sigma nm}{s}\right)\right)$ total time. For $s = m$, the total runtime becomes $\tilde{O}\left(n + \min\left(\frac{nk}{\sqrt{m}}, \sigma n\right)\right)$, which matches the time cost of the fastest offline algorithm. Moreover, the worst-case time cost per character is still $\tilde{O}(\sqrt{k})$.

2012 ACM Subject Classification: Theory of computation → Pattern matching
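To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of what a k-mismatch algorithm must output. It is not the algorithm from the paper: it stores the whole text and runs in O(nm) time, whereas the streaming algorithms discussed above work in limited space. The function name `k_mismatch_naive` and the convention of reporting `None` when the distance exceeds k are assumptions made for illustration.

```python
from typing import Iterator, Optional


def k_mismatch_naive(pattern: str, text: str, k: int) -> Iterator[Optional[int]]:
    """Brute-force reference for the k-mismatch problem (illustrative only).

    For every text position `end` at which a full-length window ends,
    yield the Hamming distance between the pattern and that window if
    it is at most k, and None otherwise.
    """
    m = len(pattern)
    for end in range(m - 1, len(text)):
        start = end - m + 1
        mismatches = 0
        for a, b in zip(pattern, text[start:end + 1]):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # distance already exceeds k; stop counting
        yield mismatches if mismatches <= k else None
```

For example, `list(k_mismatch_naive("abc", "abcabdxbc", 1))` evaluates to `[0, None, None, 1, None, None, 1]`: the windows `abc`, `abd`, and `xbc` are within one mismatch of the pattern, and every other window exceeds the threshold.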

Original language: English
Title of host publication: 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020
Editors: Inge Li Gørtz, Oren Weimann
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959771498
State: Published - 1 Jun 2020
Event: 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 - Copenhagen, Denmark
Duration: 17 Jun 2020 to 19 Jun 2020

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 161

Conference

Conference: 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020
Country/Territory: Denmark
City: Copenhagen
Period: 17/06/20 to 19/06/20

Keywords

  • Hamming distance
  • K-mismatch
  • Streaming pattern matching

All Science Journal Classification (ASJC) codes

  • Software
