TY - GEN
T1 - Dictionary matching with one gap
AU - Amir, Amihood
AU - Levy, Avivit
AU - Porat, Ely
AU - Shalom, B. Riva
N1 - Funding Information: This research was supported by the Kabarnit Cyber consortium funded by the Chief Scientist in the Israeli Ministry of Economy under the Magnet Program.
PY - 2014
Y1 - 2014
N2 - The dictionary matching with gaps problem is to preprocess a dictionary D of d gapped patterns P_1,...,P_d over alphabet Σ, where each gapped pattern P_i is a sequence of subpatterns separated by bounded sequences of don't cares. Then, given a query text T of length n over alphabet Σ, the goal is to output all locations in T at which a pattern P_i ∈ D, 1 ≤ i ≤ d, ends. There is renewed interest in the gapped matching problem stemming from cyber security. In this paper we solve the problem where all patterns in the dictionary have one gap with at least α and at most β don't cares, where α and β are given parameters. Specifically, we show that the dictionary matching with a single gap problem can be solved either with preprocessing time O(d log d + |D|), space O(d log^ε d + |D|), and query time O(n(β-α) log log d log^2 min{d, log |D|} + occ), or with preprocessing time O(d^2·ovr + |D|), space O(d^2 + |D|), and query time O(n(β-α) + occ), where occ is the number of patterns found and ovr is the maximal number of subpatterns including each other as a prefix or as a suffix. As far as we know, this is the best solution for this setting of the problem, where many overlaps may exist in the dictionary.
AB - The dictionary matching with gaps problem is to preprocess a dictionary D of d gapped patterns P_1,...,P_d over alphabet Σ, where each gapped pattern P_i is a sequence of subpatterns separated by bounded sequences of don't cares. Then, given a query text T of length n over alphabet Σ, the goal is to output all locations in T at which a pattern P_i ∈ D, 1 ≤ i ≤ d, ends. There is renewed interest in the gapped matching problem stemming from cyber security. In this paper we solve the problem where all patterns in the dictionary have one gap with at least α and at most β don't cares, where α and β are given parameters. Specifically, we show that the dictionary matching with a single gap problem can be solved either with preprocessing time O(d log d + |D|), space O(d log^ε d + |D|), and query time O(n(β-α) log log d log^2 min{d, log |D|} + occ), or with preprocessing time O(d^2·ovr + |D|), space O(d^2 + |D|), and query time O(n(β-α) + occ), where occ is the number of patterns found and ovr is the maximal number of subpatterns including each other as a prefix or as a suffix. As far as we know, this is the best solution for this setting of the problem, where many overlaps may exist in the dictionary.
UR - http://www.scopus.com/inward/record.url?scp=84958549137&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-07566-2_2
DO - 10.1007/978-3-319-07566-2_2
M3 - Conference contribution
SN - 9783319075655
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 11
EP - 20
BT - Combinatorial Pattern Matching - 25th Annual Symposium, CPM 2014, Proceedings
PB - Springer Verlag
T2 - 25th Annual Symposium on Combinatorial Pattern Matching, CPM 2014
Y2 - 16 June 2014 through 18 June 2014
ER -