Cost-Effective LLM Utilization for Machine Learning Tasks over Tabular Data

Yael Einy, Tova Milo, Slava Novgorodov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Classic machine learning (ML) models excel in modeling tabular datasets but lack broader world knowledge due to the absence of pre-training, an area where Large Language Models (LLMs) stand out. This paper presents an effective method that bridges the gap, leveraging LLMs to enrich tabular data to enhance the performance of classical ML models. Despite the previously limited success of direct LLM application to tabular tasks due to their high computational demands, our approach selectively enriches datasets with essential world knowledge, balancing performance improvement with cost-effectiveness. This work advances the capabilities of traditional ML models and opens new avenues for research at the convergence of classical ML and LLMs, marking the onset of a new era in cost-effective data enrichment.
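
The abstract does not spell out how the selective enrichment works, so the following is only a minimal illustrative sketch, not the authors' method. It assumes one simple cost-control policy: query the LLM once per distinct value of a key column, and spend a fixed call budget on the most frequent values first. The names `enrich_column`, `lookup`, `budget`, and `fake_llm_lookup` are hypothetical, and `fake_llm_lookup` is a stand-in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def enrich_column(
    rows: List[Row],
    key: str,
    new_feature: str,
    lookup: Callable[[object], Optional[str]],
    budget: int,
) -> List[Row]:
    """Add `new_feature` to every row using at most `budget` calls to `lookup`.

    Assumed policy (not taken from the paper): one call per distinct value
    of `key`, most frequent values first, so each call covers many rows.
    """
    counts = Counter(row[key] for row in rows)
    chosen = [value for value, _ in counts.most_common(budget)]
    # One (potentially expensive) lookup per chosen distinct value.
    cache = {value: lookup(value) for value in chosen}
    # Rows whose value was not chosen get None for the new feature.
    return [{**row, new_feature: cache.get(row[key])} for row in rows]


def fake_llm_lookup(city: object) -> Optional[str]:
    """Hypothetical stand-in for an LLM call that returns a city's country."""
    known = {"Paris": "France", "Lyon": "France", "Santiago": "Chile"}
    return known.get(str(city))


if __name__ == "__main__":
    table = [{"city": c} for c in
             ["Paris", "Paris", "Santiago", "Paris", "Lyon", "Santiago"]]
    for row in enrich_column(table, "city", "country", fake_llm_lookup, budget=2):
        print(row)
```

The enriched rows could then be passed to any classical tabular model as an extra feature. In this sketch the budget bounds the number of LLM calls regardless of table size, which is one plausible reading of the cost and performance trade-off the abstract describes.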

Original language: English
Title of host publication: 1st Workshop on Governance, Understanding and Integration of Data for Effective and Responsible AI, GUIDE-AI 2024, Co-located with SIGMOD 2024
Pages: 45-49
Number of pages: 5
ISBN (Electronic): 9798400706943
State: Published - 9 Jun 2024
Event: 1st Workshop on Governance, Understanding and Integration of Data for Effective and Responsible AI, GUIDE-AI 2024 - Santiago, Chile
Duration: 14 Jun 2024 → 14 Jun 2024

Publication series

Name: 1st Workshop on Governance, Understanding and Integration of Data for Effective and Responsible AI, GUIDE-AI 2024, Co-located with SIGMOD 2024

Conference

Conference: 1st Workshop on Governance, Understanding and Integration of Data for Effective and Responsible AI, GUIDE-AI 2024
Country/Territory: Chile
City: Santiago
Period: 14/06/24 → 14/06/24

Keywords

  • Data Enrichment
  • Data Integration
  • Large Language Models

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Software
