Pyramid Vision Transformer With Python
- (2021/06/21) Code of PVTv2 is released! PVTv2 substantially improves on PVTv1 and outperforms Swin Transformer with ImageNet-1K pre-training.
The image is from Transformers: Revenge of the Fallen.
This repository contains the official implementation of PVTv1 & PVTv2 in image classification, object detection, and semantic segmentation tasks.
Model Zoo
Image Classification
For classification configs & weights, see >>>here<<<.
Method | Size | Acc@1 | #Params (M) |
---|---|---|---|
PVTv2-B0 | 224 | 70.5 | 3.7 |
PVTv2-B1 | 224 | 78.7 | 14.0 |
PVTv2-B2-Linear | 224 | 82.1 | 22.6 |
PVTv2-B2 | 224 | 82.0 | 25.4 |
PVTv2-B3 |
|