Best Practices for Preparing and Augmenting Image Data for CNNs
Last Updated on July 5, 2019
It is challenging to know how to best prepare image data when training a convolutional neural network.
This involves both scaling the pixel values and applying image data augmentation techniques during both the training and evaluation of the model.
Instead of testing a wide range of options, a useful shortcut is to consider the types of data preparation, train-time augmentation, and test-time augmentation used by state-of-the-art models that achieve the best performance on a challenging computer vision benchmark: the ImageNet Large Scale Visual Recognition Challenge, or ILSVRC, which is based on the ImageNet dataset.
In this tutorial, you will discover best practices for preparing and augmenting photographs for image classification tasks with convolutional neural networks.
After completing this tutorial, you will know:
- Image data should probably be centered by subtracting the per-channel mean pixel values calculated on the training dataset.
- Training data augmentation should probably involve random rescaling, horizontal flips, perturbations to brightness, contrast, and color, as well as random cropping.
- Test-time augmentation should probably involve both rescaling each image to multiple sizes and making predictions for several different systematic crops of each rescaled version of the image.
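To make these three points concrete, the short sketches below illustrate one possible way to implement each of them; they are hedged examples, not prescriptions from a specific state-of-the-art model. First, per-channel centering can be done directly with NumPy by computing one mean per color channel over the training images and subtracting those means from every image. The arrays here are randomly generated stand-ins for a real dataset.

```python
import numpy as np

# Hypothetical stand-ins for a real dataset: (num_images, height, width, channels)
train_images = np.random.rand(100, 224, 224, 3).astype("float32")
test_images = np.random.rand(20, 224, 224, 3).astype("float32")

# Per-channel means computed over the training set only (one value per RGB channel)
channel_means = train_images.mean(axis=(0, 1, 2))

# Center both splits using the statistics calculated on the training set
train_centered = train_images - channel_means
test_centered = test_images - channel_means

print(channel_means.shape)                   # (3,)
print(train_centered.mean(axis=(0, 1, 2)))   # approximately [0, 0, 0]
```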
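Second, much of the train-time augmentation can be approximated with the Keras ImageDataGenerator class. Note the approximations: contrast and color perturbations are not built into this class, and small width/height shifts stand in for true random cropping. The specific parameter values below are illustrative assumptions only.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hypothetical stand-ins for a real labeled dataset
train_images = np.random.rand(100, 224, 224, 3).astype("float32")
train_labels = np.random.randint(0, 10, size=(100,))

# One possible augmentation configuration; values are illustrative, not tuned
datagen = ImageDataGenerator(
    featurewise_center=True,      # subtract the per-channel mean (computed by fit below)
    horizontal_flip=True,         # random horizontal flips
    zoom_range=[0.8, 1.2],        # random rescaling between 80% and 120%
    brightness_range=[0.8, 1.2],  # random brightness perturbation
    width_shift_range=0.1,        # small shifts approximate random cropping
    height_shift_range=0.1,
)

# Compute the per-channel mean from the training images (needed for featurewise_center)
datagen.fit(train_images)

# Iterator that yields augmented batches during training, e.g. passed to model.fit()
it = datagen.flow(train_images, train_labels, batch_size=32)
```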
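Third, test-time augmentation along these lines can be sketched as a small helper that rescales an image to several sizes, takes the four corner crops and the center crop of each, and averages the model's predictions over all of them. The function name, the specific scales, and the square-resize shortcut are assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf

def tta_predict(model, image, scales=(256, 288, 320), crop_size=224):
    # `image` is a single (height, width, 3) array; `model` is assumed to
    # accept batches of (crop_size, crop_size, 3) images.
    crops = []
    for scale in scales:
        # Rescale the whole image (a simple square resize is used here)
        resized = tf.image.resize(image, (scale, scale)).numpy()
        h, w = resized.shape[:2]
        # Four corner crops plus the center crop of the rescaled image
        offsets = [
            (0, 0),
            (0, w - crop_size),
            (h - crop_size, 0),
            (h - crop_size, w - crop_size),
            ((h - crop_size) // 2, (w - crop_size) // 2),
        ]
        for top, left in offsets:
            crops.append(resized[top:top + crop_size, left:left + crop_size])
    # Average the predictions over all rescaled crops
    preds = model.predict(np.stack(crops), verbose=0)
    return preds.mean(axis=0)
```

Averaging the class probabilities over the rescaled crops is one common way to combine the predictions; other schemes (such as averaging logits or taking a majority vote) are also used in practice.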