Streaming Pattern Matching with d Wildcards

Research output: Contribution to journal › Article › peer-review

Abstract

In the pattern matching with d wildcards problem, one is given a text T of length n and a pattern P of length m that contains d wildcard characters, each denoted by a special symbol '?'. A wildcard character matches any other character. The goal is to establish, for each m-length substring of T, whether it matches P. In the streaming model variant of the pattern matching with d wildcards problem, the text T arrives one character at a time, and the goal is to report, before the next character arrives, whether the last m characters match P while using only o(m) words of space. In this paper we introduce two new algorithms for the d wildcard pattern matching problem in the streaming model. The first is a randomized Monte Carlo algorithm that is parameterized by a constant 0 ≤ δ ≤ 1. This algorithm uses Õ(d^{1-δ}) amortized time per character and Õ(d^{1+δ}) words of space. The second algorithm, which is used as a black box in the first algorithm, is a randomized Monte Carlo algorithm that uses O(d + log m) worst-case time per character and O(d log m) words of space.
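
To make the problem statement concrete, here is a minimal Python sketch of a naive baseline for the streaming variant. It is not either of the paper's algorithms: it buffers the last m characters, so it uses O(m) space and O(m) time per character, which is exactly the cost the o(m)-space algorithms in the abstract avoid. The names `WILDCARD`, `matches`, `stream_matches`, and `match_end_positions` are hypothetical and chosen only for illustration.

```python
from collections import deque

WILDCARD = "?"  # assumed wildcard symbol, as in the abstract


def matches(window, pattern):
    """Return True if the window matches the pattern position by position.

    A wildcard in the pattern matches any character of the window.
    """
    return len(window) == len(pattern) and all(
        p == WILDCARD or p == c for c, p in zip(window, pattern)
    )


def stream_matches(text_stream, pattern):
    """Consume the text one character at a time.

    After each character, yield (i, matched), where i is the 0-based index
    of the character just read and matched says whether the last m
    characters match the pattern.

    Naive baseline: keeps the last m characters (O(m) space) and rechecks
    the whole window on every arrival (O(m) time per character).
    """
    m = len(pattern)
    window = deque(maxlen=m)  # oldest character is dropped automatically
    for i, ch in enumerate(text_stream):
        window.append(ch)
        yield i, len(window) == m and matches(window, pattern)


def match_end_positions(text, pattern):
    """Collect the end positions of all matching m-length substrings."""
    return [i for i, ok in stream_matches(text, pattern) if ok]


if __name__ == "__main__":
    print(match_end_positions("abcaxcac", "a?c"))
```

In this sketch the pattern `a?c` has m = 3 and d = 1; each reported index is the position of the last character of a matching window. A streaming algorithm in the sense of the abstract must produce the same answers without storing the window.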

Original language: English
Pages (from-to): 1988-2015
Number of pages: 28
Journal: Algorithmica
Volume: 81
Issue number: 5
State: Published - 15 May 2019

Keywords

  • Fingerprints
  • Pattern matching
  • Streaming algorithms
  • String combinatorics

All Science Journal Classification (ASJC) codes

  • Computer Science (all)
  • Computer Science Applications
  • Applied Mathematics
