How Much Training Data is Required for Machine Learning?
Last Updated on May 23, 2019
The amount of data you need depends both on the complexity of your problem and on the complexity of your chosen algorithm.
This is a fact, but it does not help you if you are at the pointy end of a machine learning project.
A common question I get asked is:
How much data do I need?
I cannot answer this question directly for you, or for anyone. But I can give you a handful of ways of thinking about this question.
In this post, I lay out a suite of methods that you can use to think about how much training data you need to apply machine learning to your problem.
My hope is that one or more of these methods will help you understand the difficulty of the question and how tightly it is coupled to the heart of the induction problem that you are trying to solve.
Let’s dive into it.
Note: Do you have your own heuristic methods for deciding how much data is required for machine learning? Please share them in the comments.