Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
LightningDOT
This repository contains the source code and pre-trained/fine-tuned checkpoints for the NAACL 2021 paper “LightningDOT”. It currently supports fine-tuning on MSCOCO and Flickr30k. Pre-training code and a demo for FULL MSCOCO retrieval are also available.
Some code in this repo is copied or modified from UNITER and DPR.
If you find the code useful for your research, please consider citing:
@inproceedings{sun2021lightningdot,
title={LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval},
author={Sun, Siqi and Chen, Yen-Chun and Li, Linjie and Wang, Shuohang and Fang, Yuwei and Liu, Jingjing},
booktitle={NAACL-HLT},
year={2021}
}
UNITER Environment
To run UNITER for re-ranking, please set up a separate environment based on the UNITER repo.
The rest of the code uses a conda environment that can be created as follows.
Environment