TY - CHAP
T1 - Geometric Sketch
T2 - The Inflatable-Shrinkable Sketch
AU - Biton, Dvir
AU - Friedman, Roy
AU - Shahout, Rana
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - A sketch is a probabilistic data structure useful for collecting statistics about data streams, such as the frequency of each item. Sketches require sub-linear memory and can be updated and queried quickly. Their estimation errors are bounded to a known range with a certain probability, both determined by the sketch’s parameters, which allows users to tailor a sketch to their accuracy needs. Conventional sketches, such as count-sketch (CS) and count-min sketch (CMS), are configured with a fixed size, set during their initialization. That is, their parameters remain constant, while their absolute estimation error grows with the length of the stream. Unfortunately, this means that additional memory cannot be utilized to limit the estimation error buildup. Being able to dynamically increase and decrease the sketch’s memory usage is therefore attractive when the stream size is unknown in advance, or when memory availability changes after initialization. To that end, we present Geometric Sketch (GS), a novel sketch that supports both increasing and decreasing memory usage at a granularity of a single counter, with a memory overhead of only two integers that are not counters. All of our code is open source [2].
AB - A sketch is a probabilistic data structure useful for collecting statistics about data streams, such as the frequency of each item. Sketches require sub-linear memory and can be updated and queried quickly. Their estimation errors are bounded to a known range with a certain probability, both determined by the sketch’s parameters, which allows users to tailor a sketch to their accuracy needs. Conventional sketches, such as count-sketch (CS) and count-min sketch (CMS), are configured with a fixed size, set during their initialization. That is, their parameters remain constant, while their absolute estimation error grows with the length of the stream. Unfortunately, this means that additional memory cannot be utilized to limit the estimation error buildup. Being able to dynamically increase and decrease the sketch’s memory usage is therefore attractive when the stream size is unknown in advance, or when memory availability changes after initialization. To that end, we present Geometric Sketch (GS), a novel sketch that supports both increasing and decreasing memory usage at a granularity of a single counter, with a memory overhead of only two integers that are not counters. All of our code is open source [2].
UR - http://www.scopus.com/inward/record.url?scp=105003409842&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-87766-7_24
DO - 10.1007/978-3-031-87766-7_24
M3 - Chapter
T3 - Lecture Notes on Data Engineering and Communications Technologies
SP - 270
EP - 281
BT - Lecture Notes on Data Engineering and Communications Technologies
PB - Springer Science and Business Media Deutschland GmbH
ER -