MMNas: Deep Multimodal Neural Architecture Search

MMNas

MMNas: Deep Multimodal Neural Architecture Search

This repository corresponds to the PyTorch implementation of the MMnas for visual question answering (VQA), visual grounding (VGD), and image-text matching (ITM) tasks.

Prerequisites

Software and Hardware Requirements

You may need a machine with at least 4 GPU (>= 8GB), 50GB memory for VQA and VGD and 150GB for ITM and 50GB free disk space. We strongly recommend to use a SSD drive to guarantee high-speed I/O.

You should first install some necessary packages.

  1. Install Python >= 3.6

  2. Install Cuda >= 9.0 and cuDNN

  3. Install PyTorch >= 0.4.1 with CUDA (Pytorch 1.x is also supported).

  4. Install SpaCy and initialize the GloVe as follows:

     

     

     

    To finish reading, please visit source site