Target oriented network intelligence collection: effective exploration of social networks

Rami Puzis, Liron Kachko, Barak Hagbi, Roni Stern, Ariel Felner

Research output: Contribution to journal › Article › peer-review

Abstract

Target Oriented Network Intelligence Collection (TONIC) is a crawling process whose goal is to find social network profiles that contain information about a given target. Such profiles are called leads, and the TONIC problem is how to minimize the crawling costs incurred while finding them. We model this problem as a search problem in an unknown graph and present a best-first search approach for solving it. Three key challenges are (1) which profiles to consider for crawling, (2) in what order to crawl them, and (3) when additional crawling is no longer worthwhile. For the first challenge, we propose two frameworks: the Restricted TONIC Framework (RTF), which restricts the search to immediate neighbors of previously found leads, and the Extended TONIC Framework (ETF), which extends the scope of the search to a wider neighborhood. Guidelines for choosing between the two frameworks are provided. For the second challenge, we propose a set of effective topology-based heuristics that guide the search towards profiles that are more likely to be leads. For the third challenge, we propose to use data collected in previously executed crawls to learn when additional crawling is expected to be useful.
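
The best-first search restricted to neighbors of known leads can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `restricted_tonic_crawl`, the callbacks `get_neighbors` and `is_lead`, and the fixed `budget` are hypothetical names introduced here. The priority used (the number of already-known leads linked to a candidate) is only an assumed stand-in for a topology-based heuristic, and the fixed budget is an assumed stand-in for the learned stopping rule the abstract mentions.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Set, Tuple


def restricted_tonic_crawl(
    target: str,
    get_neighbors: Callable[[str], Iterable[str]],  # hypothetical: returns a profile's links
    is_lead: Callable[[str], bool],                 # hypothetical: crawls a profile and checks it
    budget: int,                                    # assumed stopping rule: max profiles crawled
) -> List[str]:
    """Best-first crawl that only considers neighbors of the target and of found leads."""
    leads: List[str] = []
    crawled: Set[str] = {target}
    lead_links: Dict[str, int] = {}        # candidate -> number of known leads linking to it
    frontier: List[Tuple[int, str]] = []   # min-heap on negated score, so highest score pops first

    def add_candidates(source: str) -> None:
        # Every uncrawled neighbor of a lead becomes (or stays) a candidate;
        # its score grows each time another lead links to it.
        for neighbor in get_neighbors(source):
            if neighbor in crawled:
                continue
            lead_links[neighbor] = lead_links.get(neighbor, 0) + 1
            heapq.heappush(frontier, (-lead_links[neighbor], neighbor))

    add_candidates(target)
    cost = 0
    while frontier and cost < budget:
        _, profile = heapq.heappop(frontier)
        if profile in crawled:
            continue  # outdated heap entry for a profile that was already crawled
        crawled.add(profile)
        cost += 1  # assumed unit cost per crawled profile
        if is_lead(profile):
            leads.append(profile)
            add_candidates(profile)  # only leads are expanded (restricted framework)
    return leads
```

In this sketch, non-leads are crawled but never expanded, which is what keeps the search within the immediate neighborhood of found leads. A wider-neighborhood variant would also add candidates from crawled non-leads, at the price of a larger frontier.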

Original language: American English
Pages (from-to): 1447-1480
Number of pages: 34
Journal: World Wide Web
Volume: 22
Issue number: 4
State: Published - 15 Jul 2019

Keywords

  • Artificial intelligence
  • Heuristic search
  • Online social networks

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
