When to Act and When to Ask: Policy Learning With Deferral Under Hidden Confounding

Marah Ghoummaid, Uri Shalit

Research output: Contribution to journalConference articlepeer-review

Abstract

We consider the task of learning how to act in collaboration with a human expert based on observational data. The task is motivated by high-stake scenarios such as healthcare and welfare, where algorithmic action recommendations are made to a human expert, opening the option of deferring recommendation in cases where the human might act better on their own. This task is especially challenging when dealing with observational data, as using such data runs the risk of hidden confounders whose existence can lead to biased and harmful policies. However, unlike standard policy learning, the presence of a human expert can mitigate some of these risks. We build on the work of Mozannar and Sontag [2020] on consistent surrogate loss for learning with the option of deferral to an expert, where they solve a cost-sensitive supervised classification problem. Since we are solving a causal problem, where labels do not exist, we use a causal model to learn costs which are robust to a bounded degree of hidden confounding. We prove that our approach can take advantage of the strengths of both the model and the expert to obtain a better policy than either. We demonstrate our results by conducting experiments on synthetic and semi-synthetic data and show the advantages of our method compared to baselines.

Original languageEnglish
JournalAdvances in Neural Information Processing Systems
Volume37
StatePublished - 2024
Externally publishedYes
Event38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 202415 Dec 2024

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'When to Act and When to Ask: Policy Learning With Deferral Under Hidden Confounding'. Together they form a unique fingerprint.

Cite this