Connecting Language and Vision for Natural Language-Based Vehicle Retrieval
AI City 2021
The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)
We have two codebases. For the final submission, we ensemble the features produced by both codebases.
Part One is here: https://github.com/ShuaiBai623/AIC2021-T5-CLV
Part Two is here: https://github.com/layumi/NLP-AICity2021
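As a rough illustration of what a feature ensemble can look like, the sketch below L2-normalizes the features from each model, concatenates them, and ranks vehicle tracks by cosine similarity to each text query. This is a minimal sketch under assumptions, not necessarily the procedure used for the submission: the function names, the equal weighting of models, and the NumPy array layout (one row per query or track) are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def ensemble_features(feature_sets):
    """Concatenate per-model features after normalizing each model's rows.

    feature_sets: list of arrays, one per model, each of shape (n_items, dim_i).
    Assumption: every model contributes with equal weight.
    """
    normed = [l2_normalize(np.asarray(f, dtype=np.float64)) for f in feature_sets]
    return l2_normalize(np.concatenate(normed, axis=1))


def rank_tracks(text_feats, track_feats):
    """Return, for each text query, track indices sorted by descending similarity."""
    sims = text_feats @ track_feats.T
    return np.argsort(-sims, axis=1)
```

With equal weights, the similarity between ensembled text and track features equals the mean of the per-model cosine similarities, so this form of ensemble behaves like averaging the similarity matrices of the individual models.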
Prepare
- Preprocess the dataset to prepare frames, motion maps, and NLP augmentation:
  - scripts/extract_vdo_frms.py extracts frames.
  - scripts/get_motion_maps.py generates motion maps.
  - scripts/deal_nlpaug.py performs NLP augmentation.
- Download the pretrained models of Part One to checkpoints. The checkpoints can be found here. The best single-model score on TestA is 0.1927, from motion_effb3_NOCLS_nlpaug_320.pth.
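The three preprocessing scripts named above could be run in sequence with a small driver like the one below. This is only a sketch: the script paths come from this README, but the execution order, the absence of command-line arguments, and the helper names build_commands and run_preprocessing are assumptions. Check each script for paths or options it expects before running.

```python
import subprocess
import sys

# Script paths as listed in this README. Running them in this order and
# without arguments is an assumption, not something the README states.
PREPROCESS_SCRIPTS = [
    "scripts/extract_vdo_frms.py",
    "scripts/get_motion_maps.py",
    "scripts/deal_nlpaug.py",
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per preprocessing script."""
    return [[python, script] for script in PREPROCESS_SCRIPTS]


def run_preprocessing(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops at the first script that exits with an error.
    """
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_preprocessing() prints the commands without executing them; run_preprocessing(dry_run=False) runs them from the repository root.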
The directory structures