Searching for Efficient Multi-Stage Vision Transformers in PyTorch
This repository contains the official PyTorch implementation of “Searching for Efficient Multi-Stage Vision Transformers” and is based on DeiT and timm.

Illustration of the proposed multi-stage ViT-Res network.

Illustration of weight-sharing neural architecture search with multi-architectural sampling.

Accuracy-MACs trade-offs of the proposed ViT-ResNAS. Our networks achieve comparable results to previous work.

Requirements

The codebase is tested with 8 V100 (16GB) GPUs. To install requirements:

pip install -r requirements.txt

Docker files are provided to set up the environment. Please run: cd […]