Efficient Vision Transformers with Dynamic Token Sparsification
DynamicViT

This repository contains the PyTorch implementation of DynamicViT.

Created by Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh

Model Zoo

We provide our DynamicViT models pretrained on ImageNet:

Usage

Requirements

torch>=1.7.0
torchvision>=0.8.1
timm==0.4.5

Data preparation: download and extract the ImageNet images from http://image-net.org/. The directory structure should be

│ILSVRC2012/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ……
│  ├── ……
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  […]
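
Before starting a long training or evaluation run, it can help to confirm that the extracted dataset matches the layout described above. The following is a minimal sketch using only the Python standard library. The helper name `check_imagenet_layout` is hypothetical and not part of the DynamicViT repository, and it assumes only what the tree above shows: a `train/` and a `val/` folder, each containing class folders (such as `n01440764`) that hold `.JPEG` files.

```python
from pathlib import Path


def check_imagenet_layout(root):
    """Return a list of problems found under `root`; an empty list means the layout looks right.

    Hypothetical helper for illustration, not part of the DynamicViT repository.
    """
    root = Path(root)
    problems = []
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        # Class folders are the immediate subdirectories, e.g. n01440764.
        class_dirs = sorted(p for p in split_dir.iterdir() if p.is_dir())
        if not class_dirs:
            problems.append(f"no class folders in {split_dir}")
        for class_dir in class_dirs:
            # Each class folder should hold at least one .JPEG image.
            has_jpeg = any(
                f.is_file() and f.suffix.upper() == ".JPEG"
                for f in class_dir.iterdir()
            )
            if not has_jpeg:
                problems.append(f"no .JPEG files in {class_dir}")
    return problems
```

Calling `check_imagenet_layout("/path/to/ILSVRC2012")` (the path is a placeholder) returns an empty list when both splits are present and populated. Otherwise it returns one message per problem, which is usually quicker to read than a data-loader error raised mid-epoch.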