NearBucket-LSH: Efficient Similarity Search in P2P Networks

Naama Kraus, David Carmel, Idit Keidar, Meni Orenbach

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collections of objects, called buckets, that are likely to be similar to the query. More specifically, NearBucket-LSH employs an LSH extension that also searches in near buckets, which improves search quality but significantly increases the network cost. We reduce this cost by exploiting the internal properties of both LSH and the P2P overlay. We show that NearBucket-LSH increases search quality for a given network cost compared to prior art. In many cases, search quality increases by more than 50%.
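
The idea of probing near buckets in addition to the exact bucket can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes random-hyperplane hashing (suited to cosine similarity) and treats a "near bucket" as one whose bit signature differs from the query's in exactly one position. The abstract does not state either choice, and all function names below are hypothetical. The sketch runs on a single machine and ignores the P2P overlay entirely.

```python
import random


def make_hyperplanes(k, dim, seed=0):
    """Draw k random hyperplanes (assumed hash family, not taken from the abstract)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def bucket_id(vec, hyperplanes):
    """k-bit signature: bit i is 1 if vec lies on the non-negative side of hyperplane i."""
    return tuple(
        1 if sum(h_j * v_j for h_j, v_j in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def near_buckets(bucket):
    """All bucket ids at Hamming distance 1 (assumed definition of 'near')."""
    return [
        bucket[:i] + (1 - bucket[i],) + bucket[i + 1:]
        for i in range(len(bucket))
    ]


def build_index(objects, hyperplanes):
    """Map each bucket id to the names of the objects hashed into it."""
    index = {}
    for name, vec in objects.items():
        index.setdefault(bucket_id(vec, hyperplanes), []).append(name)
    return index


def candidates(query, index, hyperplanes, use_near=True):
    """Collect candidates from the exact bucket, optionally adding near buckets."""
    exact = bucket_id(query, hyperplanes)
    probe = [exact] + (near_buckets(exact) if use_near else [])
    found = []
    for b in probe:
        found.extend(index.get(b, []))
    return found
```

With k-bit signatures, probing near buckets adds k lookups per query. In a distributed setting each lookup may become a network message, which is the cost increase the abstract refers to.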

Original language: English
Title of host publication: SIMILARITY SEARCH AND APPLICATIONS, SISAP 2016
Editors: Erich Schubert, Michael E. Houle, Laurent Amsaleg
Pages: 236-249
Number of pages: 14
Volume: 9939
DOIs
State: Published - 2016
Event: 9th International Conference on Similarity Search and Applications, SISAP 2016 - Tokyo, Japan
Duration: 24 Oct 2016 to 26 Oct 2016

Publication series

Name: Lecture Notes in Computer Science

Conference

Conference: 9th International Conference on Similarity Search and Applications, SISAP 2016
Country/Territory: Japan
City: Tokyo
Period: 24/10/16 to 26/10/16

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
