Revitalize Region Feature for Democratizing Video-Language Pre-training

Guanyu Cai, Yixiao Ge, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, Xiaohu Qie, Jianping Wu, Mike Zheng Shou [Arxiv]
PyTorch implementation of our method for video-language pre-training.
Requirements
conda create -n demovlp python=3.8
conda activate demovlp
pip install -r requirements
Pre-trained weights
| Model | Dataset | Download |
|---|---|---|
| DemoVLP | WebVid+CC3M | Model |
| DemoVLP | WebVid+CC3M+CC7M | Model |
Data
Download the pre-trained model
mkdir pretrained
cd pretrained
wget -c https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_224-80ecf9dd.pth
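Because `wget -c` resumes partial downloads, it can be worth checking that the checkpoint arrived intact before training. The sketch below is not part of this repository. It assumes the file name follows the `torch.hub` naming convention, in which the suffix after the last hyphen (`80ecf9dd` here) is the leading hex digits of the file's SHA-256 hash. The helper names `expected_hash_prefix`, `sha256_of`, and `checkpoint_is_intact` are hypothetical, and the path assumes the script is run from the repository root.

```python
import hashlib
import re
from pathlib import Path

# Assumption: checkpoint names follow the torch.hub convention
# "<name>-<sha256 prefix>.<ext>", with a prefix of eight or more hex digits.
HASH_SUFFIX = re.compile(r"-([0-9a-f]{8,})\.[^.]+$")


def expected_hash_prefix(filename):
    """Return the hex prefix embedded in the file name, or None if absent."""
    match = HASH_SUFFIX.search(filename)
    return match.group(1) if match else None


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so large checkpoints are not loaded into memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_is_intact(path):
    """Compare the file's SHA-256 against the prefix embedded in its name."""
    path = Path(path)
    prefix = expected_hash_prefix(path.name)
    if prefix is None:
        raise ValueError(f"no hash suffix found in file name: {path.name}")
    return sha256_of(path).startswith(prefix)


if __name__ == "__main__":
    # Location produced by the download commands above (hypothetical if you
    # stored the file elsewhere).
    ckpt = Path("pretrained/jx_vit_base_p16_224-80ecf9dd.pth")
    if not ckpt.exists():
        print(f"{ckpt} not found")
    elif checkpoint_is_intact(ckpt):
        print("checkpoint hash matches file name")
    else:
        print("hash mismatch, consider re-downloading")
```

If the check reports a mismatch, deleting the file and running the `wget` command again is usually the simplest fix.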