Skip to main navigation Skip to search Skip to main content

Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

Research output: Contribution to journalArticlepeer-review

Abstract

Denoising diffusion probabilistic models (DDPMs) are a recent family of generative models that achieve state-of-the-art results. In order to obtain class-conditional generation, it was suggested to guide the diffusion process by gradients from a time-dependent classifier. While the idea is theoretically sound, deep learning-based classifiers are infamously susceptible to gradient-based adversarial attacks. Therefore, while traditional classifiers may achieve good accuracy scores, their gradients are possibly unreliable and might hinder the improvement of the generation results. Recent work discovered that adversarially robust classifiers exhibit gradients that are aligned with human perception, and these could better guide a generative process towards semantically meaningful images. We utilize this observation by defining and training a time-dependent adversarially robust classifier and use it as guidance for a generative diffusion model. In experiments on the highly challenging and diverse ImageNet dataset, our scheme introduces significantly more intelligible intermediate gradients, better alignment with theoretical findings, as well as improved generation results under several evaluation metrics. Furthermore, we conduct an opinion survey whose findings indicate that human raters prefer our method’s results.

Original languageEnglish
JournalTransactions on Machine Learning Research
Volume2023-March
StatePublished - 2023

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance'. Together they form a unique fingerprint.

Cite this