OpenAI's DALL-E for large-scale training in Mesh TensorFlow
If this proves as efficient as GPT-Neo, this repo should be able to train models up to, and larger than, the size of OpenAI's DALL-E (12B parameters).
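As a rough sanity check on that scale, the parameter count of a decoder-only transformer is commonly approximated as 12 · n_layers · d_model² (attention plus MLP weights, ignoring embeddings). The layer and width values below are illustrative only, not taken from this repo's configs:

```python
def approx_transformer_params(n_layers: int, d_model: int) -> int:
    """Rough decoder-only transformer size: ~12 * L * d^2
    (4*d^2 for attention Q/K/V/output projections, 8*d^2 for the MLP),
    ignoring embeddings and biases."""
    return 12 * n_layers * d_model ** 2

# Illustrative depth/width only -- not this repo's actual config.
print(approx_transformer_params(64, 3968) / 1e9)  # ~12.1 (billions)
```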
No pretrained models… Yet.
Thanks to Ben Wang for the TF VAE implementation as well as getting the mtf version working, and to Aran Komatsuzaki for help building the mtf VAE and input pipeline.
git clone https://github.com/EleutherAI/GPTNeo
cd GPTNeo
pip3 install -r requirements.txt
Training Setup
Runs on TPUs; untested on GPUs, but it should work in theory. The example configs are designed to run on a TPU v3-32 pod.
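For context on what a v3-32 config implies: in Mesh TensorFlow, the sizes of the mesh dimensions must multiply to the number of processors, and a TPU v3-32 slice exposes 32 cores. The sketch below (not code from this repo) enumerates the valid two-dimensional mesh shapes for a given core count:

```python
def valid_2d_mesh_shapes(n_cores: int):
    """All (rows, cols) factorizations whose product equals the core
    count -- in Mesh TensorFlow, mesh dimension sizes must multiply
    to the number of processors the mesh is laid out over."""
    return [(a, n_cores // a) for a in range(1, n_cores + 1)
            if n_cores % a == 0]

# A TPU v3-32 slice has 32 cores.
print(valid_2d_mesh_shapes(32))
# [(1, 32), (2, 16), (4, 8), (8, 4), (16, 2), (32, 1)]
```

Which factorization works best depends on how the model and batch dimensions are split, so the example configs' choice should be treated as a starting point.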
To set up TPUs, sign up for Google Cloud Platform, and create a