Key-Locked Rank One Editing for Text-to-Image Personalization

Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Text-to-image (T2I) models offer a new level of flexibility by allowing users to guide the creative process through natural language. However, personalizing these models to align with user-provided visual concepts remains a challenging problem. The task of T2I personalization poses multiple hard challenges, such as maintaining high visual fidelity while allowing creative control, combining multiple personalized concepts in a single image, and keeping a small model size. We present Perfusion, a T2I personalization method that addresses these challenges using dynamic rank-1 updates to the underlying T2I model. Perfusion avoids overfitting by introducing a new mechanism that "locks" new concepts' cross-attention Keys to their superordinate category. Additionally, we develop a gated rank-1 approach that enables us to control the influence of a learned concept at inference time and to combine multiple concepts. This allows runtime-efficient balancing of visual fidelity and textual alignment with a single 100KB trained model. Importantly, it can span different operating points across the Pareto front without additional training. We compare our approach to strong baselines and demonstrate its qualitative and quantitative strengths.
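
To make the idea of a gated rank-1 update concrete, the following is a minimal NumPy sketch and not the Perfusion implementation. It assumes a single linear layer `W`, a hypothetical concept direction `key`, a hypothetical desired output `target`, and a scalar `gate_strength` chosen at inference time; the abstract does not specify how the gate is actually computed, so all of these names are illustrative.

```python
import numpy as np

def gated_rank_one_output(W, x, key, target, gate_strength=1.0):
    """Apply W to x plus a gated rank-1 correction (illustrative only).

    The correction is the rank-1 matrix
        delta = outer(target - W @ key, key) / (key @ key),
    which maps `key` to `target` and leaves inputs orthogonal to `key`
    unchanged. `gate_strength` scales its influence: 0 disables the
    edit, 1 applies it fully. All names here are assumptions.
    """
    residual = target - W @ key                    # what the edit must add for `key`
    delta = np.outer(residual, key) / (key @ key)  # rank-1 update matrix
    return W @ x + gate_strength * (delta @ x)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))    # stand-in for a projection weight
key = rng.normal(size=4)       # hypothetical concept input direction
target = rng.normal(size=3)    # hypothetical desired output for `key`

full = gated_rank_one_output(W, key, key, target, gate_strength=1.0)
off = gated_rank_one_output(W, key, key, target, gate_strength=0.0)

# An input orthogonal to `key` is untouched by the edit.
x_orth = rng.normal(size=4)
x_orth -= (x_orth @ key) / (key @ key) * key
untouched = gated_rank_one_output(W, x_orth, key, target, gate_strength=1.0)
```

Intermediate values of `gate_strength` interpolate between the original and the edited output without retraining, which is one simple way to read the abstract's claim about moving between operating points at inference time.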

Original language: English
Title of host publication: Proceedings - SIGGRAPH 2023 Conference Papers
Editors: Stephen N. Spencer
ISBN (Electronic): 9798400701597
DOIs
State: Published - 23 Jul 2023
Event: 2023 Special Interest Group on Computer Graphics and Interactive Techniques Conference, SIGGRAPH 2023 - Los Angeles, United States
Duration: 6 Aug 2023 → 10 Aug 2023

Publication series

Name: Proceedings - SIGGRAPH 2023 Conference Papers

Conference

Conference: 2023 Special Interest Group on Computer Graphics and Interactive Techniques Conference, SIGGRAPH 2023
Country/Territory: United States
City: Los Angeles
Period: 6/08/23 → 10/08/23

Keywords

  • Diffusion
  • Personalization
  • Rank-One
  • Text-to-Image

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design
