Quickly download, clean up, and install public datasets into a database management system
retriever
Finding data is one thing. Getting it ready for analysis is another. Acquiring, cleaning, standardizing, and importing publicly available data is time-consuming because many datasets lack machine-readable metadata and do not conform to established data structures and formats. The Data Retriever automates the first steps in the data analysis pipeline by downloading, cleaning, and standardizing datasets, and importing them into relational databases, flat files, or programming languages. Automating this process reduces the time it takes a user to get most large datasets up and running by hours, and in some cases days.
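To make the manual work concrete, here is a minimal sketch of the kind of clean-and-import step described above, using only the Python standard library. It is not the Data Retriever's implementation. The sample data, the table name `surveys`, and the helpers `clean_rows` and `load_into_sqlite` are hypothetical and exist only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw data: inconsistent header case, stray whitespace,
# and "NA" used for a missing value.
RAW_CSV = """Site , Species,Abundance
A, Dipodomys merriami ,12
B,Chaetodipus penicillatus,NA
"""


def standardize(cell):
    """Strip whitespace, map "NA" to None, and convert digit strings to int."""
    cell = cell.strip()
    if cell == "NA":
        return None
    if cell.isdigit():
        return int(cell)
    return cell


def clean_rows(text):
    """Return a normalized header and a list of cleaned row tuples."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip().lower() for name in next(reader)]
    rows = [tuple(standardize(cell) for cell in record) for record in reader if record]
    return header, rows


def load_into_sqlite(conn, table, header, rows):
    """Create a table from the header and insert the cleaned rows."""
    # Table and column names are interpolated directly, so they must be trusted.
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()


header, rows = clean_rows(RAW_CSV)
conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
load_into_sqlite(conn, "surveys", header, rows)
```

Even for a three-line file, the header, whitespace, and missing-value conventions each need their own handling, which is the per-dataset effort the paragraph above refers to.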
Installing the Current Release
If you have Python installed, you can install the current release using either pip:
pip install retriever
or conda after adding the conda-forge channel (conda config