Bounds for overlapping interval join on MapReduce

Foto Afrati, Shlomi Dolev, Shantanu Sharma, Jeffrey D. Ullman

Research output: Contribution to journalConference articlepeer-review

Abstract

We consider the problem of 2-way interval join, where we want to find all pairs of overlapping intervals, i.e., intervals that share at least one point in common. We present lower and upper bounds on the replication rate for this problem when it is implemented in MapReduce. We study three cases, where intervals in the input are: (i) unit-length and equally-spaced, (ii) variable-length and equally-spaced, and (iii) equally-spaced with specific distribution of the various lengths. Our algorithms offer intuition as how to build algorithms for other cases, especially when we have some statistical knowledge about the distribution of the lengths of the intervals. E.g., if mostly large intervals interact with small intervals and not within themselves, then we believe our techniques can be extended to achieve better replication rate.

Original languageAmerican English
Pages (from-to)3-6
Number of pages4
JournalCEUR Workshop Proceedings
Volume1330
StatePublished - 1 Jan 2015
EventJoint Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory, EDBT/ICDT-WS 2015 - Brussels, Belgium
Duration: 27 Mar 2015 → …

All Science Journal Classification (ASJC) codes

  • General Computer Science

Cite this