7 Ways to Handle Large Data Files for Machine Learning

Exploring and applying machine learning algorithms to datasets that are too large to fit into memory is a common problem.

This leads to questions like:

  • How do I load my multiple gigabyte data file?
  • My algorithm crashes when I run it on my dataset; what should I do?
  • Can you help me with out-of-memory errors?

In this post, I want to offer some common suggestions you may want to consider.

Photo by Gareth Thompson, some rights reserved.

1. Allocate More Memory

Some machine learning tools or libraries may be limited by a default memory configuration.

Check if you can re-configure your tool or library to allocate more memory.

A good example is Weka, where you can increase the memory as a parameter when starting the application.
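Weka runs on the Java Virtual Machine, so its available memory is controlled by the standard Java heap flag `-Xmx`. As a minimal sketch, assuming you launch Weka from Python and that the jar path and the 4 GB heap size below are placeholders for your own installation, you might start it like this:

```python
import subprocess

# Launch Weka with a larger Java heap (here 4 GB) via the standard -Xmx flag.
# The jar location is a placeholder -- point it at your own Weka install.
subprocess.run(["java", "-Xmx4g", "-jar", "/path/to/weka.jar"])
```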

2. Work with a Smaller Sample

Are you sure you need to work with all of the data?

Take a random sample of your data, such as 1,000 or 100,000 rows, and use it to explore and prototype before committing to the full dataset; one way to do this is sketched below.
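As a rough sketch, assuming your data is in a CSV file and you are using the pandas library (the file name, row counts, and sampling fraction are placeholders), you could either load only part of the file or draw a random sample in chunks:

```python
import pandas as pd

# Quick option: load only the first 100,000 rows of the file.
sample = pd.read_csv("large_dataset.csv", nrows=100_000)

# Better option: read the file in chunks and keep a small random fraction
# of each chunk, so the sample is spread across the whole file.
chunks = pd.read_csv("large_dataset.csv", chunksize=100_000)
random_sample = pd.concat(
    chunk.sample(frac=0.01, random_state=1) for chunk in chunks
)
```

Working with a smaller sample lets you iterate quickly, and you can always refit your chosen model on the full dataset later.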