Abstract
We present a deterministic black box solution for online approximate matching. Given a pattern of length $m$ and a streaming text of length $n$ that arrives one character at a time, the task is to report the distance between the pattern and a sliding window of the text as soon as the new character arrives. Our solution requires $O\left(\sum_{j=1}^{\log_2 m} T(n, 2^{j-1})/n\right)$ time for each input character, where $T(n,m)$ is the total running time of the best offline algorithm. The types of approximation that are supported include exact matching with wildcards, matching under the Hamming norm, approximating the Hamming norm, $k$-mismatch and numerical measures such as the $L_2$ and $L_1$ norms. For these examples, the resulting online algorithms take $O(\log^2 m)$, $O(\sqrt{m \log m})$, $O(\log^2 m/\varepsilon^2)$, $O(\sqrt{k \log k}\,\log m)$, $O(\log^2 m)$ and $O(\sqrt{m \log m})$ time per character, respectively. The space overhead is linear in the pattern size, which we show is optimal for any deterministic algorithm.
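
To make the online task concrete, the following is a minimal sketch of the problem statement for the Hamming-distance case. It is a naive baseline that spends O(m) time per arriving character, not the black-box construction described in the abstract. The function name `online_hamming` and the example strings are hypothetical, chosen only for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def online_hamming(pattern: str, text: Iterable[str]) -> Iterator[Optional[int]]:
    """Naive baseline (an illustration, not the paper's algorithm).

    After each arriving character, yield the Hamming distance between
    `pattern` and the most recent len(pattern) characters of the text,
    or None while fewer than len(pattern) characters have arrived.
    """
    m = len(pattern)
    window = deque(maxlen=m)  # holds only the last m characters seen
    for ch in text:
        window.append(ch)  # the oldest character drops out automatically
        if len(window) < m:
            yield None  # the sliding window is not yet full
        else:
            # Compare the pattern with the window position by position: O(m) work.
            yield sum(p != w for p, w in zip(pattern, window))


# Feed the text one character at a time and collect the reported distances.
distances = list(online_hamming("aba", "abaabb"))
print(distances)  # [None, None, 0, 2, 2, 1]
```

The baseline stores only the last m characters, so its extra space is linear in the pattern size, but its per-character time is linear in m rather than matching the per-character bounds quoted in the abstract.
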
Field | Value |
---|---|
Original language | American English |
Pages (from-to) | 731-736 |
Number of pages | 6 |
Journal | Information and Computation |
Volume | 209 |
Issue number | 4 |
State | Published - 1 Apr 2011 |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics