
Curious Feature Selection-Based Clustering

Michal Moran, Goren Gordon

Research output: Contribution to journal › Article › peer-review

Abstract

In tabular data, certain challenges can negatively affect the quality of machine learning models: high dimensionality; noisy, irrelevant, and repetitive features; interactions between features; and the fact that instances often come from different sources or distributions. Feature selection, instance selection, and clustering algorithms address some of these challenges. Here, we propose a new holistic framework that assists in clarifying the structure of tabular datasets and enables the production of higher quality machine learning models. The framework, based on intrinsic-reward deep reinforcement learning loops, uses curious feature selection as the basis for clustering data instances, effectively creating blocks within the tabular data with the most relevant features for each cluster. The framework results in a clustering algorithm, wherein the instances are clustered based on their predicted optimal informative features. We show that this framework makes it possible to improve the accuracy of learning models on artificial and real datasets and to provide important insights into the data themselves.
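To make the idea of "blocks" concrete, the following is a toy sketch only, not the authors' method. It assumes each instance already comes with a binary mask of selected features (in the paper such a selection is predicted by the curiosity-driven reinforcement learning loop, which is not reproduced here) and simply groups instances that share the same mask. The function names `cluster_by_feature_mask` and `blocks`, and the example data, are hypothetical.

```python
from collections import defaultdict


def cluster_by_feature_mask(masks):
    """Group instance indices by their selected-feature mask.

    `masks` is a list of per-instance binary masks. These are assumed to be
    given; this sketch does not learn them.
    """
    clusters = defaultdict(list)
    for idx, mask in enumerate(masks):
        clusters[tuple(mask)].append(idx)
    return dict(clusters)


def blocks(data, masks):
    """For each cluster, keep only its rows and its selected columns."""
    result = {}
    for mask, rows in cluster_by_feature_mask(masks).items():
        cols = [j for j, keep in enumerate(mask) if keep]
        result[mask] = [[data[i][j] for j in cols] for i in rows]
    return result


# Hypothetical toy table: 3 instances, 3 features.
data = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
# Hypothetical masks: instances 0 and 2 share the same informative features.
masks = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

example_blocks = blocks(data, masks)
```

Under these assumptions, instances 0 and 2 form one block restricted to the first and third features, and instance 1 forms its own block restricted to the second feature.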

Original language: English
Pages (from-to): 6146-6158
Number of pages: 13
Journal: IEEE Transactions on Artificial Intelligence
Volume: 5
Issue number: 12
State: Published - 2024

Keywords

  • Clustering
  • curiosity loop
  • deep reinforcement learning (RL)
  • feature selection

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Artificial Intelligence
