LAFITE: Towards Language-Free Training for Text-to-Image Generation

Code for the paper LAFITE: Towards Language-Free Training for Text-to-Image Generation (CVPR 2022). More details will be added later. Requirements: the implementation is based on stylegan2-ada-pytorch and CLIP; the required packages can be found at those links. Preparing datasets, for example: python dataset_tool.py --source=./path_to_some_dataset/ --dest=./datasets/some_dataset.zip --width=256 --height=256 --transform=center-crop. The files at ./path_to_some_dataset/ should be laid out like:
./path_to_some_dataset/
  ├ 1.png
  ├ 1.txt
  ├ 2.png
  ├ 2.txt
  ├ …
We provide links to several commonly used datasets that we have already processed (with CLIP-ViT/B-32): MS-COCO Training Set […]
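The paired layout above is easy to produce from any collection of (image, caption) pairs. Here is a minimal sketch (not part of the LAFITE repo) that writes the numbered .png/.txt pairs dataset_tool.py expects; the `pairs` list and source paths are hypothetical stand-ins for your own data:

```python
# Sketch: write numbered image/caption pairs in the layout dataset_tool.py expects.
# The `pairs` list is a hypothetical stand-in for your own data source.
from pathlib import Path
from PIL import Image

pairs = [  # (path to source image, caption) -- made-up examples
    ("raw/cat.jpg", "a cat sitting on a windowsill"),
    ("raw/dog.jpg", "a dog running on the beach"),
]

out_dir = Path("./path_to_some_dataset")
out_dir.mkdir(parents=True, exist_ok=True)

for i, (img_path, caption) in enumerate(pairs, start=1):
    # dataset_tool.py handles resizing/cropping via --width/--height/--transform,
    # so the images only need to be saved as numbered PNGs here.
    Image.open(img_path).convert("RGB").save(out_dir / f"{i}.png")
    (out_dir / f"{i}.txt").write_text(caption)
```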

Read more

Vector Quantized Diffusion Model for Text-to-Image Synthesis

Overview: this is the official repo for the paper Vector Quantized Diffusion Model for Text-to-Image Synthesis. VQ-Diffusion is based on a VQ-VAE whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM). It produces significantly better text-to-image generation results than autoregressive models with similar numbers of parameters, and compared with previous GAN-based methods it can handle more complex scenes and improves synthesized image quality by a large margin. Framework Requirements […]
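To make the idea concrete, here is a toy sketch (our illustration, not the repo's code) of the forward corruption process over VQ-VAE token indices: at each step every token is independently replaced by a uniformly random codebook entry with some probability, which is the kind of discrete noising a text-conditioned denoiser is then trained to invert. The codebook size and noise schedule below are made up:

```python
# Toy sketch of discrete forward noising over VQ token indices (illustration only).
import torch

K = 1024          # hypothetical codebook size
T = 100           # hypothetical number of diffusion steps
betas = torch.linspace(1e-3, 0.1, T)  # made-up per-step replacement probabilities

def q_step(x_t, beta, K):
    """Replace each token with a uniform random codebook index w.p. beta."""
    replace = torch.rand_like(x_t, dtype=torch.float) < beta
    random_tokens = torch.randint(0, K, x_t.shape)
    return torch.where(replace, random_tokens, x_t)

x = torch.randint(0, K, (1, 16 * 16))  # one image as a 16x16 grid of token ids
for t in range(T):
    x = q_step(x, betas[t], K)
# After many steps x is close to uniform noise; a conditional DDPM-style
# denoiser (conditioned on the text) is trained to reverse this process.
```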

Read more

Generate vibrant and detailed images using only text

CLIP Guided Diffusion, from RiversHaveWings. Generate vibrant and detailed images using only text. See captions and more generations in the Gallery. See also: VQGAN-CLIP. This code is currently under active development and is subject to frequent changes. Please file an issue if you have any constructive feedback, questions, or issues with the code or Colab notebook. Windows user? Please file an issue if you have any issues with the code; I currently have no way to test that platform […]
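The core mechanism is using CLIP's image/text similarity as a guidance signal. Here is a minimal sketch of that step, assuming the openai `clip` package (the diffusion model itself is omitted, and the random image tensor merely stands in for the sampler's current estimate):

```python
# Sketch of the core CLIP-guidance step: the gradient of image/text similarity
# w.r.t. the image, which a guided sampler uses to steer each denoising step.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # CLIP loads fp16 weights on CUDA; fp32 matches the input below

text = clip.tokenize(["a watercolor painting of a fox"]).to(device)
with torch.no_grad():
    text_emb = model.encode_text(text).float()
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Stand-in for the current (decoded, 224x224, CLIP-normalized) diffusion
# estimate; random here just so the sketch runs end to end.
image = torch.randn(1, 3, 224, 224, device=device, requires_grad=True)

img_emb = model.encode_image(image).float()
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
loss = 1 - (img_emb * text_emb).sum()  # cosine distance to the prompt
grad = torch.autograd.grad(loss, image)[0]
# A guided sampler would subtract a scaled `grad` at each denoising step
# to pull samples toward the text prompt.
```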

Read more

Text to Image Generation with Semantic-Spatial Aware GAN in Python

text2image: this repository includes the implementation for Text to Image Generation with Semantic-Spatial Aware GAN. This repo is not complete yet. Network structure: the structure of the semantic-spatial aware convolutional network (SSACN) is shown below; a rough code illustration of the idea follows this excerpt. Requirements:
python 3.6+
pytorch 1.0+
numpy
matplotlib
opencv
Or install the full requirements by running: pip install -r requirements.txt. TODO:
[x] instruction to prepare dataset
[ ] remove all unnecessary files
[x] add link to download our pre-trained model
[ ] clean code including comments
[ […]
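The SSACN figure is not reproduced in this excerpt. As a rough illustration of the general idea only (text-conditioned modulation of image features gated by a predicted spatial mask; this is our simplification, not the paper's exact block):

```python
# Rough illustration of text-conditioned spatial modulation (a simplification,
# not the exact SSACN block from the paper; all sizes are made up).
import torch
import torch.nn as nn

class TextSpatialModulation(nn.Module):
    def __init__(self, feat_ch, text_dim):
        super().__init__()
        # Predict where in the feature map the text information should apply.
        self.mask = nn.Sequential(nn.Conv2d(feat_ch, 1, 3, padding=1), nn.Sigmoid())
        self.gamma = nn.Linear(text_dim, feat_ch)  # text -> per-channel scale
        self.beta = nn.Linear(text_dim, feat_ch)   # text -> per-channel shift

    def forward(self, feat, text_emb):
        m = self.mask(feat)                        # (B, 1, H, W) spatial gate
        g = self.gamma(text_emb)[:, :, None, None]
        b = self.beta(text_emb)[:, :, None, None]
        return feat * (1 + m * g) + m * b          # modulate only where m is high

feat = torch.randn(2, 64, 32, 32)  # image features (hypothetical sizes)
text = torch.randn(2, 256)         # sentence embedding (hypothetical size)
out = TextSpatialModulation(64, 256)(feat, text)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```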

Read more

A Novel Dataset and A Text-Specific Refinement Approach

Rethinking Text Segmentation: this is the repo hosting the dataset TextSeg and the code for TexRNet from the following paper: Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang and Humphrey Shi, Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach (arXiv link). Note [2021.04.21]: so far, our dataset is only partially released, with images and semantic labels. Since many people may request the dataset for OCR or non-segmentation tasks, please stay tuned, and we will release the […]

Read more