Common Pitfalls In Machine Learning Projects
Last Updated on June 7, 2016
In a recent presentation, Ben Hamner described the common pitfalls in machine learning projects he and his colleagues have observed during competitions on Kaggle.
The talk was titled “Machine Learning Gremlins” and was presented in February 2014 at Strata.
In this post we take a look at the pitfalls from Ben’s talk, what they look like and how to avoid them.
Machine Learning Process
Early in the talk, Ben presented a snapshot of the process for working a machine learning problem end-to-end.
This snapshot included nine steps, as follows:
- Start with a business problem
- Source data
- Split data
- Select an evaluation metric
- Perform feature extraction
- Perform model training
- Perform feature selection
- Perform model selection
- Deploy to a production system
He commented that the process is iterative rather than linear.
He also noted that any step in this process can go wrong and derail the whole project.
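To make the middle steps of that process concrete, here is a minimal sketch in plain Python. It is not taken from Ben's talk: the synthetic data, the toy threshold model, and every function name (`split_data`, `accuracy`, `extract_features`, `train`, `evaluate`, `run_pipeline`) are assumptions made purely for illustration.

```python
import random


def split_data(rows, holdout_fraction, seed):
    """Split data: shuffle a copy, then hold out a fraction of the rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def accuracy(y_true, y_pred):
    """Evaluation metric: fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def extract_features(raw):
    """Feature extraction (toy): the raw value and its square."""
    return (raw, raw * raw)


def train(rows, feature_index):
    """Model training: threshold one feature at the midpoint of the class means."""
    pos = [extract_features(raw)[feature_index] for raw, label in rows if label == 1]
    neg = [extract_features(raw)[feature_index] for raw, label in rows if label == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda raw: 1 if extract_features(raw)[feature_index] > threshold else 0


def evaluate(model, rows):
    """Score a trained model on rows it was not fitted on."""
    return accuracy([label for _, label in rows], [model(raw) for raw, _ in rows])


def run_pipeline(seed=0):
    # Source data: synthetic (raw value, label) pairs, invented for this sketch.
    data = [(i * 0.25, 1 if i > 20 else 0) for i in range(40)]
    # Split data: keep a test set aside, then carve a validation set from the rest.
    development, test = split_data(data, holdout_fraction=0.25, seed=seed)
    training, validation = split_data(development, holdout_fraction=0.2, seed=seed)
    # Feature and model selection: keep the candidate that scores best on validation.
    candidates = {index: train(training, index) for index in (0, 1)}
    best_index = max(candidates, key=lambda index: evaluate(candidates[index], validation))
    # The test set is touched once, at the end, to estimate production performance.
    return best_index, evaluate(candidates[best_index], test)


best_index, test_accuracy = run_pipeline()
print(f"selected feature {best_index}, test accuracy {test_accuracy:.2f}")
```

In a real project each function would be revisited many times, which is the iterative loop Ben describes. The sketch keeps the test set separate from the validation set so that the choices made during feature and model selection do not leak into the final estimate.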
Discriminating Dogs and Cats
Ben presented a case study problem for building an automatic cat