A Gentle Introduction to Mini-Batch Gradient Descent and How to Configure Batch Size
Last Updated on August 19, 2019
Stochastic gradient descent is the dominant method used to train deep learning models.
There are three main variants of gradient descent, and it can be confusing which one to use.
In this post, you will discover the one variant of gradient descent you should use in general and how to configure it.
After completing this post, you will know:
- What gradient descent is and how it works at a high level.
- What batch, stochastic, and mini-batch gradient descent are and the benefits and limitations of each method.
- That mini-batch gradient descent is the go-to method and how to configure it in your applications (a minimal sketch follows this list).
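As a quick preview of the configuration step, most deep learning libraries expose the batch size as a single argument. Below is a minimal sketch in Keras, assuming placeholder data X and y and a toy binary classification model (all names here are illustrative, not from the post); only the batch_size argument to fit() distinguishes the stochastic, mini-batch, and batch variants.

```python
# Minimal sketch: batch size as a single argument in Keras.
# X, y, and the model architecture below are placeholders for illustration.
from numpy import random
from keras.models import Sequential
from keras.layers import Dense

# Placeholder data: 1,000 samples with 10 input features and a binary target.
X = random.rand(1000, 10)
y = random.randint(0, 2, size=(1000,))

# A toy model trained with stochastic gradient descent ('sgd').
model = Sequential()
model.add(Dense(16, activation='relu', input_dim=10))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd')

# The batch_size argument selects the gradient descent variant:
#   batch_size=1             -> stochastic gradient descent
#   batch_size=len(X)        -> batch gradient descent
#   1 < batch_size < len(X)  -> mini-batch gradient descent (e.g. 32)
model.fit(X, y, epochs=10, batch_size=32)
```

With batch_size=32 and 1,000 samples, each epoch performs 32 weight updates: 31 full batches plus one final partial batch of 8 samples.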
Kick-start your project with my new book Deep Learning With Python, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
- Update Apr/2018: Added additional reference to support a batch size of 32.
- Update Jun/2019: Removed mention of average gradient.