OpenAI’s DALL-E for large-scale training in Mesh-Tensorflow
OpenAI’s DALL-E in Mesh-Tensorflow. If this is as efficient as GPT-Neo, this repo should be able to train models up to, and larger than, the size of OpenAI’s DALL-E (12B params). No pretrained models… yet.

Thanks to Ben Wang for the TF VAE implementation as well as getting the mtf version working, and Aran Komatsuzaki for help building the mtf VAE and input pipeline.

git clone https://github.com/EleutherAI/GPTNeo
cd GPTNeo
pip3 install -r requirements.txt

Training Setup

Runs on TPUs, untested on […]