Coverage-Based Caching in Cloud Data Lakes

Grisha Weintraub, Ehud Gudes, Shlomi Dolev

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Cloud data lakes are a modern approach to handling large volumes of data. They separate the compute and storage layers, making them highly scalable and cost-effective. However, query performance in cloud data lakes could be faster, and various efforts have been made to enhance it in recent years. We introduce our approach to this problem, which is based on a novel caching technique where instead of caching actual data, we cache metadata called a coverage set.

Original language: American English
Title of host publication: Proceedings of the 17th ACM International Systems and Storage Conference, SYSTOR 2024
Pages: 193
Number of pages: 1
ISBN (Electronic): 9798400711817
State: Published - 16 Sep 2024
Event: 17th ACM International Systems and Storage Conference, SYSTOR 2024 - Virtual, Online, Israel
Duration: 23 Sep 2024 to 24 Sep 2024

Publication series

Name: Proceedings of the 17th ACM International Systems and Storage Conference, SYSTOR 2024

Conference

Conference: 17th ACM International Systems and Storage Conference, SYSTOR 2024
Country/Territory: Israel
City: Virtual, Online
Period: 23/09/24 to 24/09/24

Keywords

  • caching
  • cloud storage
  • data lakes

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Hardware and Architecture
  • Software
  • Electrical and Electronic Engineering
