GoSeed: Generating an optimal seeding plan for deduplicated storage

Aviv Nachman, Gala Yadgar, Sarai Sheinvald

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Deduplication decreases the physical occupancy of files in a storage volume by removing duplicate copies of data chunks, but creates data-sharing dependencies that complicate standard storage management tasks. Specifically, data migration plans must consider the dependencies between files that are remapped to new volumes and files that are not. Thus far, only greedy approaches have been suggested for constructing such plans, and it is unclear how they compare to one another and how much they can be improved. We set out to bridge this gap for seeding, a form of migration in which the target volume is initially empty. We present GoSeed, a formulation of seeding as an integer linear programming (ILP) problem, and three acceleration methods for applying it to real-sized storage volumes. Our experimental evaluation shows that, while the greedy approaches perform well on "easy" problem instances, the cost of their solution can be significantly higher than that of GoSeed's solution on "hard" instances, for which they are sometimes unable to find a solution at all.
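To make the data-sharing dependency concrete, the following is a minimal Python sketch of a toy seeding instance. It is not GoSeed's ILP formulation, which the abstract does not spell out; it simply enumerates every subset of files by brute force. The cost model is an assumption made for illustration: a chunk referenced only by moved files counts as migrated, a chunk referenced by both moved and remaining files counts as replicated, and a plan is feasible when the migrated size is within `slack` of `target`. The names `plan_cost`, `best_seeding_plan`, `target`, and `slack` are hypothetical.

```python
from itertools import combinations


def plan_cost(files, chunk_size, moved):
    """Return (migrated_size, replicated_size) for a set of moved files.

    Assumed cost model (not taken from the paper): chunks used only by
    moved files leave the source; chunks shared with remaining files
    must be stored on both volumes.
    """
    moved_chunks = set().union(*(files[f] for f in moved))
    kept_chunks = set().union(*(files[f] for f in files if f not in moved))
    migrated = moved_chunks - kept_chunks
    replicated = moved_chunks & kept_chunks
    return (
        sum(chunk_size[c] for c in migrated),
        sum(chunk_size[c] for c in replicated),
    )


def best_seeding_plan(files, chunk_size, target, slack):
    """Exhaustively search for the feasible plan with least replication.

    Returns (moved_files, replicated_size, migrated_size), or None when
    no subset of files lands within `slack` of `target`.
    """
    best = None
    names = sorted(files)
    for r in range(len(names) + 1):
        for moved in combinations(names, r):
            migrated, replicated = plan_cost(files, chunk_size, set(moved))
            if abs(migrated - target) > slack:
                continue  # infeasible: migrated size misses the target
            if best is None or replicated < best[1]:
                best = (set(moved), replicated, migrated)
    return best


# Toy volume: files "a" and "b" share chunk 2; "c" and "d" share chunk 4.
files = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {4, 5}}
chunk_size = {1: 4, 2: 4, 3: 4, 4: 4, 5: 4}

print(best_seeding_plan(files, chunk_size, target=8, slack=0))
```

In this toy instance, moving "c" and "d" together migrates 8 units with no replication, whereas moving "a" and "d" also migrates 8 units but replicates chunks 2 and 4. Exhaustive search is exponential in the number of files, which is why an exact approach on real volumes needs an ILP solver and acceleration methods of the kind the abstract mentions.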

Original language: English
Title of host publication: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
Pages: 193-207
Number of pages: 15
ISBN (Electronic): 9781939133120
State: Published - 2020
Event: 18th USENIX Conference on File and Storage Technologies, FAST 2020 - Santa Clara, United States
Duration: 25 Feb 2020 to 27 Feb 2020

Publication series

Name: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020

Conference

Conference: 18th USENIX Conference on File and Storage Technologies, FAST 2020
Country/Territory: United States
City: Santa Clara
Period: 25/02/20 to 27/02/20

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Software
  • Computer Networks and Communications
