A Guide to Getting Datasets for Machine Learning in Python
Unlike most programming exercises, a machine learning project is a blend of code and data: you need both to produce a useful result. Over the years, many well-known datasets have been created, and several have become standard benchmarks. In this tutorial, we are going to see how to obtain those well-known public datasets easily. We will also learn how to make a synthetic dataset when none of the existing datasets fits our needs.
After finishing this tutorial, you will know:
- Where to look for freely available datasets for machine learning projects
- How to download datasets using libraries in Python
- How to generate synthetic datasets using scikit-learn
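As a quick preview of the last two points, here is a minimal sketch, assuming scikit-learn is installed. It loads the Iris dataset that ships with scikit-learn (so no network access is needed) and then generates a small synthetic classification dataset. The sample counts, feature counts, and random seed are illustrative choices, not values prescribed by this tutorial.

```python
from sklearn.datasets import load_iris, make_classification

# Load a small, well-known dataset bundled with scikit-learn
iris = load_iris()
X_real, y_real = iris.data, iris.target
print("Iris:", X_real.shape, y_real.shape)

# Generate a synthetic binary classification dataset
# (all parameter values below are assumptions chosen for illustration)
X_syn, y_syn = make_classification(
    n_samples=200,      # number of rows
    n_features=5,       # total number of columns
    n_informative=3,    # features that actually carry class signal
    n_redundant=1,      # linear combinations of informative features
    n_classes=2,
    random_state=42,    # fixed seed for reproducibility
)
print("Synthetic:", X_syn.shape, y_syn.shape)
```

Both calls return NumPy arrays, so the real and the synthetic data can be passed to the same model code without changes.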