Loading SQL data into Pandas without running out of memory
You have some data in a relational database, and you want to process it with Pandas. So you use Pandas’ handy read_sql() API to get a DataFrame—and promptly run out of memory. The problem: you’re loading all the data into memory at once. If you have enough rows in the SQL query’s results, it simply won’t fit in RAM. Pandas does have a batching option for read_sql(), which can reduce memory usage, but it’s still not perfect: it also loads […]
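As a rough illustration of the batching option mentioned above, here is a minimal sketch that passes chunksize to read_sql() so that Pandas yields a series of smaller DataFrames instead of one large one. The in-memory SQLite database, the events table, and the sum_column_in_chunks() helper are all assumptions made up for this example, not part of the original article. As the excerpt notes, this option is not perfect, so treat it as a starting point rather than a complete fix.

```python
import sqlite3

import pandas as pd


def sum_column_in_chunks(conn, query, column, chunksize=1000):
    """Aggregate one column while holding only one chunk as a DataFrame.

    Hypothetical helper for illustration: with chunksize set,
    pd.read_sql() returns an iterator of DataFrames rather than
    a single DataFrame containing every row.
    """
    total = 0
    rows = 0
    for chunk in pd.read_sql(query, conn, chunksize=chunksize):
        total += int(chunk[column].sum())
        rows += len(chunk)
    return total, rows


# Small in-memory SQLite table standing in for a real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO events (value) VALUES (?)", [(i,) for i in range(10)]
)
conn.commit()

total, rows = sum_column_in_chunks(
    conn, "SELECT id, value FROM events", "value", chunksize=4
)
print(total, rows)
```

Processing each chunk and keeping only the running result, rather than appending chunks to a list, is what keeps the DataFrame side of the memory usage bounded by the chunk size.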