Conformal Nucleus Sampling

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Language models generate text based on successively sampling the next word. A decoding procedure based on nucleus (top-p) sampling chooses from the smallest possible set of words whose cumulative probability exceeds the probability p. In this work, we assess whether a top-p set is indeed aligned with its probabilistic meaning in various linguistic contexts. We employ conformal prediction, a calibration procedure that focuses on the construction of minimal prediction sets according to a desired confidence level, to calibrate the parameter p as a function of the entropy of the next word distribution. We find that OPT models are overconfident, and that calibration shows a moderate inverse scaling with model size. Code: https://github.com/shauli-ravfogel/conformal-prediction
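The two ideas in the abstract, building a top-p set and calibrating p per entropy level, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of score (the cumulative mass needed to reach the true word), the fixed entropy bin edges, and the use of a split-conformal quantile are all assumptions made here for illustration.

```python
import math

def nucleus_set(probs, p):
    """Smallest set of word indices whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append(i)
        total += probs[i]
        if total >= p:
            break
    return chosen

def entropy(probs):
    """Shannon entropy (in nats) of a next-word distribution."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def mass_to_cover(probs, true_idx):
    """Assumed conformity score: total mass of words ranked at or above the true word."""
    return sum(q for q in probs if q >= probs[true_idx])

def conformal_quantile(scores, alpha):
    """Split-conformal quantile at confidence level 1 - alpha."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return 1.0  # too few calibration points: fall back to the full vocabulary
    return sorted(scores)[k - 1]

def calibrate_p_by_entropy(examples, bin_edges, alpha):
    """examples: list of (probs, true_idx). Returns one calibrated p per entropy bin."""
    bins = [[] for _ in range(len(bin_edges) + 1)]
    for probs, true_idx in examples:
        b = sum(entropy(probs) > edge for edge in bin_edges)  # index of the entropy bin
        bins[b].append(mass_to_cover(probs, true_idx))
    return [conformal_quantile(s, alpha) if s else None for s in bins]
```

Under these assumptions, a model whose calibrated p for a bin comes out above the nominal confidence level 1 - alpha would be overconfident in that entropy range, since its top-p set needs more nominal mass than advertised to contain the true word at the desired rate.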

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics, ACL 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 27-34
Number of pages: 8
ISBN (Electronic): 9781959429623
State: Published - 2023
Event: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 9 Jul 2023 to 14 Jul 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference

Conference: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Toronto
Period: 9/07/23 to 14/07/23

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
