Editing Implicit Assumptions in Text-to-Image Diffusion Models

Hadas Orgad, Bahjat Kawar, Yonatan Belinkov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Text-to-image diffusion models often make implicit assumptions about the world when generating images. While some assumptions are useful (e.g., the sky is blue), they can also be outdated, incorrect, or reflective of social biases present in the training data. Thus, there is a need to control these assumptions without requiring explicit user input or costly re-training. In this work, we aim to edit a given implicit assumption in a pre-trained diffusion model. Our Text-to-Image Model Editing method, TIME for short, receives a pair of inputs: a "source" under-specified prompt for which the model makes an implicit assumption (e.g., "a pack of roses"), and a "destination" prompt that describes the same setting, but with a specified desired attribute (e.g., "a pack of blue roses"). TIME then updates the model's cross-attention layers, as these layers assign visual meaning to textual tokens. We edit the projection matrices in these layers such that the source prompt is projected close to the destination prompt. Our method is highly efficient, as it modifies a mere 2.2% of the model's parameters in under one second. To evaluate model editing approaches, we introduce TIMED (TIME Dataset), containing 147 source and destination prompt pairs from various domains. Our experiments (using Stable Diffusion) show that TIME is successful in model editing, generalizes well for related prompts unseen during editing, and imposes minimal effect on unrelated generations.
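
The abstract does not spell out how a projection matrix is edited so that the source prompt is "projected close to" the destination prompt. The sketch below shows one plausible reading, assumed here for illustration only: a ridge-regularized least-squares update with a closed-form solution, in which source token embeddings are mapped near the outputs the original matrix gives for the matching destination tokens, while the matrix is kept near its original value. The function name `edit_projection`, the regularization weight `lam`, and the array shapes are hypothetical and are not taken from this record.

```python
import numpy as np


def edit_projection(W, src, dst, lam=0.1):
    """Closed-form, ridge-regularized edit of one projection matrix.

    Assumed setup (not stated in the abstract):
      W   : (d_out, d_in) original projection matrix
      src : (n, d_in) embeddings of the source-prompt tokens
      dst : (n, d_in) embeddings of the aligned destination-prompt tokens
      lam : weight of the penalty that keeps the result close to W

    Minimizes  sum_i ||W_new @ src[i] - W @ dst[i]||^2 + lam * ||W_new - W||_F^2.
    """
    d_in = W.shape[1]
    # Targets: what the ORIGINAL matrix produces for the destination tokens.
    targets = dst @ W.T                       # (n, d_out)
    # Setting the gradient to zero gives W_new @ B = A.
    A = lam * W + targets.T @ src             # (d_out, d_in)
    B = lam * np.eye(d_in) + src.T @ src      # (d_in, d_in), symmetric
    # Solve for W_new = A @ inv(B) without forming the inverse explicitly.
    return np.linalg.solve(B, A.T).T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 6))
    src = rng.normal(size=(3, 6))
    dst = rng.normal(size=(3, 6))
    W_new = edit_projection(W, src, dst)
    target = dst @ W.T
    print("distance to target before:", np.linalg.norm(src @ W.T - target))
    print("distance to target after: ", np.linalg.norm(src @ W_new.T - target))
```

Under this assumed objective, a larger `lam` keeps the edited matrix closer to the original, and a smaller `lam` pulls the source projections closer to the destination targets. In an actual diffusion model the same update would presumably be applied to each edited projection matrix in the cross-attention layers, which this toy example does not model.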

Original language: English
Title of host publication: Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Pages: 7030-7038
Number of pages: 9
ISBN (Electronic): 9798350307184
DOIs
State: Published - 2023
Externally published: Yes
Event: 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 - Paris, France
Duration: 2 Oct 2023 to 6 Oct 2023

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision

Conference

Conference: 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Country/Territory: France
City: Paris
Period: 2/10/23 to 6/10/23

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
