SeqTR: A Simple yet Universal Network for Visual Grounding
This is the official implementation of SeqTR: A Simple yet Universal Network for Visual Grounding, which simplifies and unifies the modelling of visual grounding tasks under a novel point prediction paradigm.

Installation

Prerequisites

    pip install -r requirements.txt
    wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
    pip install en_vectors_web_lg-2.1.0.tar.gz

Then install the SeqTR package in editable mode:

Data Preparation

Download our preprocessed json files, including the merged dataset for pre-training, and the DarkNet-53 model weights trained on the MS-COCO object detection task. Download the train2014 images from mscoco […]
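Before moving on to data preparation, it can help to confirm that the prerequisites from the Installation step are importable. The following is a minimal sketch, not part of the SeqTR codebase: the helper name `missing_modules` is hypothetical, and the module names `spacy` and `en_vectors_web_lg` are assumed from the pip commands above (spaCy model packages normally install under their package name).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Hypothetical helper for illustration; it only checks that each module
    is discoverable on the import path, without importing it.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names, taken from the installation commands.
    required = ["spacy", "en_vectors_web_lg"]
    missing = missing_modules(required)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

An empty list means every listed module was found. A non-empty list names the packages to reinstall with the commands in the Prerequisites step.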