TY - GEN
T1 - Approximate cover of strings
AU - Amir, Amihood
AU - Levy, Avivit
AU - Lubin, Ronit
AU - Porat, Ely
N1 - Publisher Copyright: © Amihood Amir, Avivit Levy, Ronit Lubin, and Ely Porat.
PY - 2017/7/1
Y1 - 2017/7/1
N2 - Regularities in strings arise in various areas of science, including coding and automata theory, formal language theory, combinatorics, molecular biology, and many others. A common notion for describing regularity in a string T is a cover, which is a string C for which every letter of T lies within some occurrence of C. The alignment of the cover repetitions in the given text is called a tiling. In many applications, finding exact repetitions is not sufficient, due to the presence of errors. In this paper, we use a new approach for handling errors in coverable phenomena and define the approximate cover problem (ACP), in which we are given a text that is a sequence of some cover repetitions with possible mismatch errors, and we seek a string that covers the text with the minimum number of errors. We first show that the ACP is NP-hard by studying its cover-size relaxation, in which the requested size of the approximate cover is given along with the input string; we show that this relaxation is already NP-hard. We also study two further relaxations of the ACP, which we call the partial-tiling relaxation and the full-tiling relaxation, in which a tiling of the requested cover is also given with the input string. A given full tiling retains all the occurrences of the cover before the errors, while in a partial tiling there can be additional occurrences of the cover that are not marked by the tiling. We show that the partial-tiling relaxation has polynomial time complexity and give experimental evidence that the full-tiling relaxation also has polynomial time complexity. The study of these relaxations, besides shedding further light on the complexity of the ACP, also involves a deep understanding of the properties of covers, yielding some key lemmas and observations that may be helpful for a future study of regularities in the presence of errors.
KW - Approximate cover
KW - Cover
KW - Periodicity
KW - Quasi-periodicity
UR - http://www.scopus.com/inward/record.url?scp=85027274231&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.CPM.2017.26
DO - 10.4230/LIPIcs.CPM.2017.26
M3 - Conference contribution
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017
A2 - Radoszewski, Jakub
A2 - Kärkkäinen, Juha
A2 - Rytter, Wojciech
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017
Y2 - 4 July 2017 through 6 July 2017
ER -