TY - GEN
T1 - Deleting and testing forbidden patterns in multi-dimensional arrays
AU - Ben-Eliezer, Omri
AU - Korman, Simon
AU - Reichman, Daniel
N1 - Publisher Copyright: © Omri Ben-Eliezer, Simon Korman, and Daniel Reichman.
PY - 2017/7/1
Y1 - 2017/7/1
N2 - Analyzing multi-dimensional data is a fundamental problem in various areas of computer science. As the amount of data is often huge, it is desirable to obtain sublinear time algorithms to understand local properties of the data. We focus on the natural problem of testing pattern freeness: given a large d-dimensional array A and a fixed d-dimensional pattern P over a finite alphabet Γ, we say that A is P-free if it does not contain a copy of the forbidden pattern P as a consecutive subarray. The distance of A to P-freeness is the fraction of the entries of A that need to be modified to make it P-free. For any ϵ > 0 and any large enough pattern P over any alphabet (other than a very small set of exceptional patterns), we design a tolerant tester that distinguishes between the case that the distance is at least ϵ and the case that the distance is at most a_d ϵ, with query complexity and running time c_d ϵ^{-1}, where a_d < 1 and c_d depend only on the dimension d. These testers only need to access uniformly random blocks of samples from the input A. To analyze the testers we establish several combinatorial results, including the following d-dimensional modification lemma, which might be of independent interest: For any large enough d-dimensional pattern P over any alphabet (excluding a small set of exceptional patterns for the binary case), and any d-dimensional array A containing a copy of P, one can delete this copy by modifying one of its locations without creating new P-copies in A. Our results address an open question of Fischer and Newman, who asked whether there exist efficient testers for properties related to tight substructures in multi-dimensional structured data.
KW - Pattern matching
KW - Property testing
KW - Sublinear algorithms
UR - http://www.scopus.com/inward/record.url?scp=85027272506&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ICALP.2017.9
DO - 10.4230/LIPIcs.ICALP.2017.9
M3 - Conference contribution
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017
A2 - Muscholl, Anca
A2 - Indyk, Piotr
A2 - Kuhn, Fabian
A2 - Chatzigiannakis, Ioannis
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017
Y2 - 10 July 2017 through 14 July 2017
ER -