Created covid data pipeline using PySpark and MySQL that collected data stream from API
![](https://www.deeplearningdaily.com/wp-content/uploads/2021/11/created-covid-data-pipeline-using-pyspark-and-mysql-that-collected-data-stream-from-api_61942ca78bd2a-375x210.jpeg)
A COVID data pipeline built with PySpark and MySQL: it collects a data stream from an API, applies some processing, and stores the result in a MySQL database.
Tools used: PySpark, MySQL
Procedure
- Fetch the latest data from the API using Python's requests and pandas modules.
- Apply data processing and filtering to generate summarized information.
- Store the summarized information in a MySQL database.
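The first two steps above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names, the `min_confirmed` threshold, and the record fields (`state`, `confirmed`, `deaths`) are assumptions, and the API is assumed to return a JSON array of flat records. The summarizing step is written in plain Python here so it can be tried without any setup.

```python
from collections import defaultdict


def fetch_latest(url):
    """Step 1: download the latest records from the API as a list of dicts."""
    # Third-party imports are kept local so the summarizing helper below
    # can be tried without them installed.
    import pandas as pd
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    # Assumption: the endpoint returns a JSON array of flat records.
    frame = pd.DataFrame(response.json())
    return frame.to_dict(orient="records")


def summarize(records, min_confirmed=1):
    """Step 2: drop low-count rows, then total the counts per state."""
    totals = defaultdict(lambda: {"confirmed": 0, "deaths": 0})
    for row in records:
        confirmed = int(row["confirmed"])
        if confirmed < min_confirmed:
            continue  # filtering step: skip rows below the threshold
        totals[row["state"]]["confirmed"] += confirmed
        totals[row["state"]]["deaths"] += int(row["deaths"])
    return [{"state": state, **counts} for state, counts in sorted(totals.items())]


# Hypothetical sample records standing in for an API response.
sample = [
    {"state": "Gujarat", "confirmed": 120, "deaths": 3},
    {"state": "Gujarat", "confirmed": 80, "deaths": 1},
    {"state": "Kerala", "confirmed": 0, "deaths": 0},
    {"state": "Kerala", "confirmed": 50, "deaths": 2},
]
summary = summarize(sample)
print(summary)
```

In a real run, `summarize(fetch_latest(url))` would replace the sample data, with `url` pointing at whichever API the pipeline reads from.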
I used PySpark to build the pipeline above.
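For the storing step, one way PySpark can write to MySQL is through its JDBC data source. The sketch below assumes that approach; the host, database, table, and credential values are placeholders, and the helper names are made up for illustration. The driver class shown is the one shipped with MySQL Connector/J 8.x, and the Connector/J jar is assumed to be on Spark's classpath.

```python
def jdbc_options(host, database, table, user, password, port=3306):
    """Collect the options Spark's JDBC writer needs for a MySQL target."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        # Driver class from MySQL Connector/J 8.x; older 5.x jars
        # use com.mysql.jdbc.Driver instead.
        "driver": "com.mysql.cj.jdbc.Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def store_summary(rows, options):
    """Write a list of dict rows to MySQL through Spark's JDBC data source."""
    # Imported locally: requires pyspark plus the Connector/J jar.
    from pyspark.sql import Row, SparkSession

    spark = SparkSession.builder.appName("covid-pipeline").getOrCreate()
    frame = spark.createDataFrame([Row(**row) for row in rows])
    # Append to the target table instead of overwriting earlier runs.
    frame.write.format("jdbc").options(**options).mode("append").save()


# Placeholder connection details; replace with real values before use.
options = jdbc_options("localhost", "covid", "state_summary", "etl_user", "change-me")
print(options["url"])
```

Keeping the connection settings in one dictionary makes it easier to check that the JDBC URL and the driver class match the connector version that is installed.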
{IMPORTANT}
Before moving on to the execution part, please read the notes below.
- Use the correct connector and driver name when making the connection to the MySQL database if you are going to use