A guided FP-Growth algorithm for mining multitude-targeted item-sets and class association rules in imbalanced data

Lior Shabtay, Philippe Fournier-Viger, Rami Yaari, Itai Dattner

Research output: Contribution to journal, Article, peer-review

Abstract

Identifying frequent item-sets is a popular data-mining task that consists of finding sets of items that appear frequently in data. However, finding all frequent item-sets in large or dense datasets can be time-consuming, and a user may be interested in only some specific item-sets rather than all of them. Recently, methods have been proposed for targeted item-set mining, that is, calculating the support of a given set of item-sets of interest. Although this approach is often more suitable for real applications than traditional item-set mining, performance remains an issue. To address it, this paper presents a novel algorithm for multitude-targeted mining, named Guided Frequent Pattern-Growth (GFP-Growth). GFP-Growth is designed to quickly mine a given set of item-sets using a small amount of memory. The paper proves that GFP-Growth yields the exact frequency count for each item-set of interest. It further shows that GFP-Growth can boost performance for several problems requiring item-set mining. We specifically study the problem of generating minority-class rules from imbalanced data and develop the Minority-Report Algorithm (MRA), which uses GFP-Growth to solve this problem efficiently. We prove several theoretical properties of MRA and present experimental results showing a substantial performance gain.
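
To make the task concrete, the following is a minimal brute-force sketch of targeted item-set mining as described in the abstract: given a list of item-sets of interest, count the support of each one, then derive the confidence of the rule "item-set implies minority class". It is not GFP-Growth or MRA, whose details are not given here; it is only a naive baseline of the kind such algorithms aim to outperform. The function names, the sample transactions, the class labels, and the use of the standard confidence definition (minority-class support divided by overall support) are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: each transaction is (set of items, class label).
Transaction = Tuple[FrozenSet[str], str]


def targeted_support(
    transactions: List[Transaction], targets: List[FrozenSet[str]]
) -> Dict[FrozenSet[str], int]:
    """Count, for each target item-set, the transactions that contain it.

    Naive scan: O(len(transactions) * len(targets)) subset checks.
    """
    counts = {target: 0 for target in targets}
    for items, _label in transactions:
        for target in targets:
            if target <= items:  # target is a subset of the transaction
                counts[target] += 1
    return counts


def minority_rule_confidence(
    transactions: List[Transaction],
    targets: List[FrozenSet[str]],
    minority_label: str,
) -> Dict[FrozenSet[str], float]:
    """Confidence of the rule `target -> minority_label` for each target.

    Assumes the standard definition: support within the minority class
    divided by support over all transactions.
    """
    total = targeted_support(transactions, targets)
    minority_only = [tx for tx in transactions if tx[1] == minority_label]
    minority = targeted_support(minority_only, targets)
    return {
        target: (minority[target] / total[target]) if total[target] else 0.0
        for target in targets
    }


# Made-up sample data: "rare" plays the role of the minority class.
transactions: List[Transaction] = [
    (frozenset({"a", "b"}), "rare"),
    (frozenset({"a", "c"}), "common"),
    (frozenset({"a", "b", "c"}), "common"),
    (frozenset({"b"}), "common"),
    (frozenset({"a", "b"}), "rare"),
]
targets = [frozenset({"a", "b"}), frozenset({"c"})]

support = targeted_support(transactions, targets)
confidence = minority_rule_confidence(transactions, targets, "rare")
```

On this sample, the item-set {a, b} appears in three transactions, two of them labelled "rare", so the rule {a, b} implies "rare" has confidence 2/3; the item-set {c} never appears in a "rare" transaction, so its confidence is 0. The scan revisits every transaction for every target, which is the cost that a guided, tree-based approach would be expected to reduce.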

Original language: American English
Pages (from-to): 353-375
Number of pages: 23
Journal: Information Sciences
Volume: 553
State: Published - Apr 2021

Keywords

  • Data mining
  • Guided FP-Growth
  • Imbalanced data
  • Item-set discovery
  • Minority-class rule
  • Multi-targeted mining

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence
