An information theory subspace analysis approach with application to anomaly detection ensembles

Marcelo Bacher, Irad Ben-Gal, Erez Shmueli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Identifying anomalies in multi-dimensional datasets is an important task in many real-world applications. A special case arises when anomalies are occluded in a small set of attributes (i.e., subspaces) of the data and not necessarily over the entire data space. In this paper, we propose a new subspace analysis approach named Agglomerative Attribute Grouping (AAG) that aims to address this challenge by searching for subspaces that comprise highly correlative attributes. Such correlations among attributes represent a systematic interaction among the attributes that can better reflect the behavior of normal observations and hence can be used to improve the identification of future abnormal data samples. AAG relies on a novel multi-attribute metric derived from information theory measures of partitions to evaluate the "information distance" between groups of data attributes. The empirical evaluation demonstrates that AAG outperforms state-of-the-art subspace analysis methods, when they are used in anomaly detection ensembles, both in cases where anomalies are occluded in relatively small subsets of the available attributes and in cases where anomalies represent a new class (i.e., novelties). Finally, and in contrast to existing methods, AAG does not require any tuning of parameters.
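The abstract does not state the metric itself. As a rough illustration of the kind of "information distance" it refers to, the sketch below computes the standard pairwise Rokhlin distance between two discrete attributes, d(X, Y) = H(X|Y) + H(Y|X). Reading the "Rokhlin" keyword this way is an assumption, the function names `entropy` and `rokhlin_distance` are hypothetical, and the multi-attribute metric that AAG actually uses may differ from this pairwise form.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def rokhlin_distance(x, y):
    """Pairwise Rokhlin distance between two discrete attributes.

    Uses the identity H(X|Y) + H(Y|X) = 2*H(X,Y) - H(X) - H(Y).
    Illustrative only; not the multi-attribute metric from the paper.
    """
    joint = list(zip(x, y))  # each pair is one cell of the joint partition
    return 2 * entropy(joint) - entropy(x) - entropy(y)


# Toy attributes (made-up data): b duplicates a, c is independent of a.
a = [0, 0, 1, 1]
b = [0, 0, 1, 1]
c = [0, 1, 0, 1]

d_ab = rokhlin_distance(a, b)  # attributes that induce the same partition
d_ac = rokhlin_distance(a, c)  # attributes that share no information
```

Under this definition a small distance means two attributes carry largely the same information, which is the sort of relationship an agglomerative grouping step could use to decide which attributes belong in the same subspace.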

Original language: English
Title of host publication: IC3K 2017 - Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Editors: Ana Fred, Joaquim Filipe
Pages: 27-39
Number of pages: 13
State: Published - 2017
Event: 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2017 - Funchal, Madeira, Portugal
Duration: 1 Nov 2017 - 3 Nov 2017

Publication series

Name: IC3K 2017 - Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Volume: 1

Conference

Conference: 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2017
Country/Territory: Portugal
City: Funchal, Madeira
Period: 1/11/17 - 3/11/17

Keywords

  • Anomaly detection
  • Ensemble
  • Rokhlin
  • Subspace analysis

All Science Journal Classification (ASJC) codes

  • Software
