CLOOB Conditioned Latent Diffusion training and inference code

Introduction

This repository contains the training and inference code for CLOOB Conditioned Latent Diffusion (CCLD). CCLD is similar in approach to the CLIP conditioned diffusion trained by Katherine Crowson, with a few key differences:

  • The use of latent diffusion cuts training costs by roughly a factor of
    ten, allowing a high quality 1.2 billion parameter model to converge in as
    few as five days on a single 8x A100 pod.

  • CLOOB conditioning can take advantage of CLOOB's unified latent space:
    CLOOB text and image embeds for the same inputs share a similarity of
    around 0.9. This makes it possible to train the model without captions by
    using image embeds in the training loop and text embeds at inference time.
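The embed-swapping scheme described above can be sketched as follows. This is a minimal illustration, not the repository's actual API: `CondDenoiser`, `cloob_embed`, and the shared projection are hypothetical stand-ins for the real diffusion model and CLOOB encoders.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for a diffusion model conditioned on a single
# embedding vector; the real model's architecture and API differ.
class CondDenoiser(torch.nn.Module):
    def __init__(self, latent_dim=16, embed_dim=16):
        super().__init__()
        self.net = torch.nn.Linear(latent_dim + embed_dim, latent_dim)

    def forward(self, noisy_latent, cond_embed):
        # Conditioning is just another input; the model does not care
        # whether the embed came from an image or a caption.
        return self.net(torch.cat([noisy_latent, cond_embed], dim=-1))

# Toy "CLOOB": a shared projection places image and text features in one
# unit-normalized space (illustrative only, not the real encoder).
shared = torch.nn.Linear(32, 16, bias=False)

def cloob_embed(feats):
    return F.normalize(shared(feats), dim=-1)

model = CondDenoiser()

# Training step: condition on *image* embeds, so no captions are needed.
images = torch.randn(4, 32)
pred_train = model(torch.randn(4, 16), cloob_embed(images))

# Inference: condition on *text* embeds instead. Because both embed
# types live in CLOOB's unified space, the same conditioning pathway
# accepts them interchangeably.
texts = torch.randn(4, 32)
pred_infer = model(torch.randn(4, 16), cloob_embed(texts))

print(pred_train.shape == pred_infer.shape)  # → True
```

Because the two embed types are highly similar for matching content, the denoiser trained only on image embeds still behaves sensibly when handed text embeds at sampling time.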

This combination of lower training cost and caption-free conditioning makes CCLD a practical recipe for training text-to-image models on uncaptioned datasets.
