Abstract
Deep Neural Networks (DNNs) have become effective for various machine learning tasks. DNNs are known to achieve high accuracy with unstructured data, in which each data sample (e.g., an image) consists of many raw features (e.g., pixels) of the same type. The effectiveness of this approach diminishes for structured (tabular) data, where in most cases decision tree-based models such as Random Forest (RF) or Gradient Boosting Decision Trees (GBDT) outperform DNNs. In addition, DNNs tend to perform poorly when the number of samples in the dataset is small. This paper introduces Transfer Learning for Tabular Data (TLTD), which uses a novel learning architecture designed to extract new features from structured datasets. To leverage DNNs' learning capabilities on images, we convert the tabular data into images and then apply knowledge distillation to achieve better learning. We evaluated our approach on 25 structured datasets and compared the outcomes to those of RF, eXtreme Gradient Boosting (XGBoost), TabNet, KNN, and TabPFN. The results demonstrate the usefulness of the TLTD approach.
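The abstract mentions converting tabular data into images so that image-oriented DNNs can be applied. The sketch below is a generic illustration of that idea, not the paper's actual TLTD mapping (which is not detailed here): each feature vector is zero-padded and reshaped into a square grayscale grid, with values min-max scaled to a pixel range.

```python
import numpy as np

def row_to_image(row, side=None):
    """Turn one tabular sample (a 1-D feature vector) into a square
    grayscale 'image' by zero-padding and reshaping.

    Hypothetical illustration only; the paper's exact feature-to-pixel
    layout is not specified in this abstract.
    """
    row = np.asarray(row, dtype=float)
    if side is None:
        # Smallest square grid that fits all features.
        side = int(np.ceil(np.sqrt(row.size)))
    padded = np.zeros(side * side)
    padded[: row.size] = row
    # Min-max scale to [0, 255] so the grid can be treated as pixels.
    lo, hi = padded.min(), padded.max()
    if hi > lo:
        padded = (padded - lo) / (hi - lo) * 255.0
    return padded.reshape(side, side)

# Example: a 7-feature sample becomes a 3x3 image.
img = row_to_image([0.2, 1.5, -0.3, 4.0, 2.2, 0.0, 1.1])
print(img.shape)  # (3, 3)
```

In a pipeline like the one the abstract describes, such images would then be fed to a pretrained image DNN, with distillation used to transfer its learned representations.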
| Original language | American English |
|---|---|
| Article number | 110748 |
| Journal | Applied Soft Computing |
| Volume | 147 |
| DOIs | |
| State | Published - 1 Nov 2023 |
Keywords
- Deep-learning
- Feature-extraction
- Tabular datasets
All Science Journal Classification (ASJC) codes
- Software