MUGE Text To Image Generation Baseline

Requirements and Installation

For more details, see fairseq. Briefly:

  • python == 3.6.4
  • pytorch == 1.7.1
  1. Install fairseq and the other requirements:

git clone https://github.com/MUGE-2021/image-caption-baseline
cd muge_baseline/
pip install -r requirements.txt
cd fairseq/
pip install --editable .
  2. Download the data and place it in the dataset/ directory.
    The file structure is:

text2image-baseline
    - dataset
        - ECommerce-T2I
            - T2I_train.img.tsv
            - T2I_train.text.tsv
            - ...
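
Before training, it can help to confirm that the files landed in the expected place. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_files and the EXPECTED_FILES list are hypothetical, and only the two file names shown in the tree above are checked (the entries elided by "..." are not guessed).

```python
from pathlib import Path
from typing import List

# Hypothetical list: only the two files named in the tree above.
# The files hidden behind "..." are deliberately not guessed.
EXPECTED_FILES = ["T2I_train.img.tsv", "T2I_train.text.tsv"]


def missing_dataset_files(root: str) -> List[str]:
    """Return the expected file names that are absent under
    <root>/dataset/ECommerce-T2I."""
    data_dir = Path(root) / "dataset" / "ECommerce-T2I"
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_dataset_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("Dataset layout looks OK")
```

Run it from the repository root. An empty result only means the two listed files exist; it says nothing about their contents.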

Getting Started

The model is a BART-like model that uses VQGAN as an image tokenizer. See models/t2i_baseline.py for the detailed model structure.
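
To illustrate what "image tokenizer" means here, the sketch below shows the generic vector-quantization step: each feature vector is replaced by the index of its nearest codebook entry, so an image becomes a sequence of discrete tokens that a sequence-to-sequence decoder can predict. This is a toy illustration under stated assumptions, not the code in models/t2i_baseline.py. The function quantize, the codebook, and the features are made up for the example; a real VQGAN uses a learned convolutional encoder and a much larger learned codebook.

```python
from typing import List, Sequence


def quantize(features: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each feature vector to the index of its nearest codebook
    entry, using squared Euclidean distance."""
    tokens = []
    for vec in features:
        distances = [
            sum((a - b) ** 2 for a, b in zip(vec, code))
            for code in codebook
        ]
        tokens.append(distances.index(min(distances)))
    return tokens


# Toy data (hypothetical): a 2x2 feature map flattened in raster order,
# quantized against a 3-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
features = [[0.1, 0.0], [0.9, 1.1], [0.1, 0.8], [1.0, 0.9]]
image_tokens = quantize(features, codebook)
print(image_tokens)  # [0, 1, 2, 1]
```

In a text-to-image setup of this kind, the text tokens are the encoder input and a token sequence like image_tokens is the decoder target.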

Training