Flexible caching in trie joins

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

While traditional algorithms for multiway join are based on reordering binary joins, more recent approaches have instantiated a new breed of “worst-case-optimal” in-memory algorithms wherein all relations are scanned simultaneously. Veldhuizen’s Leapfrog Trie Join (LFTJ) is an example. An important advantage of LFTJ is its small memory footprint, due to the fact that intermediate results are full tuples that can be dumped immediately. However, since the algorithm does not store intermediate results, recurring joins must be reconstructed from the source relations, resulting in excessive memory traffic. In this paper, we address this problem by incorporating caches into LFTJ. We do so by adopting recent developments in join optimization, tying variable ordering to a tree decomposition of the query. While the traditional usage of tree decomposition computes the entire result for each bag, our proposed approach incorporates caching directly into LFTJ and can dynamically adjust the size of the cache. Consequently, our solution balances memory usage against repeated computation. Our experimental study over the SNAP dataset compares various (traditional and novel) caching policies, and shows significant speedups over state-of-the-art algorithms on both join evaluation and join counting.

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2017
Subtitle of host publication: 20th International Conference on Extending Database Technology, Proceedings
Editors: Bernhard Mitschang, Volker Markl, Sebastian Bress, Periklis Andritsos, Kai-Uwe Sattler, Salvatore Orlando
Pages: 282-293
Number of pages: 12
ISBN (Electronic): 9783893180738
State: Published - 2017
Event: 20th International Conference on Extending Database Technology, EDBT 2017 - Venice, Italy
Duration: 21 Mar 2017 - 24 Mar 2017

Publication series

Name: Advances in Database Technology - EDBT
Volume: 2017-March

Conference

Conference: 20th International Conference on Extending Database Technology, EDBT 2017
Country/Territory: Italy
City: Venice
Period: 21/03/17 - 24/03/17

Keywords

  • Caching
  • Databases
  • Tree decomposition
  • Trie joins

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Software
  • Computer Science Applications
