DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets
DALLE-tools is a GitHub repository with useful tools to categorize, annotate, or sanity-check your datasets.
Installation
Just clone this repository into a folder of your choice and use one of the commands from the sections below.
WebDataset Annotator
Press the corresponding key to switch to the next page or to change the annotation category, or click on an image to add it to the current category and save it in annotations.json. Please upload your annotations.json by opening a pull request that adds it to the community_annotations folder, inside the subfolder of the dataset you used (e.g. YFCC100m, LAION400m, etc.), so everyone can use the data for better dataset annotations!
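Before opening a pull request, it can help to confirm that your annotations.json still parses as JSON. The snippet below is a minimal sketch and not part of DALLE-tools: the helper name `load_annotations` is hypothetical, and it makes no assumption about the layout of the file beyond it being valid JSON.

```python
import json
from pathlib import Path


def load_annotations(path="annotations.json"):
    """Hypothetical helper: parse an annotations file or raise ValueError."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        # Only checks that the file is well-formed JSON, not its schema.
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"{path} is not valid JSON: {err}") from err
```

Calling `load_annotations("annotations.json")` returns the parsed data if the file is well formed and raises a `ValueError` with the parser's message otherwise.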
If you want to continue to annotate a dataset where someone else already started, just copy