ETL flow framework based on YAML configs in Python
FlowMaster
A lightweight framework for building data flows. Flows are set up through configuration in a YAML file. It provides scheduling, task pools, and concurrency limits. It is fast, needs few resources, and runs on Windows and Linux. Flows run in parallel via the threading library, and an SQLite database is used internally.
Connectors are currently available for the following sources:
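To illustrate what running flows in parallel threads under a concurrency limit can look like, here is a minimal sketch using only the standard library. It is not FlowMaster's actual implementation: the names `run_flows`, `make_flow`, and `max_concurrent` are hypothetical and chosen only for this example.

```python
import threading
import time


def run_flows(flows, max_concurrent=2):
    """Run each callable in its own thread, at most `max_concurrent` at once.

    Hypothetical helper for illustration; not part of FlowMaster's API.
    Returns (results by flow name, peak number of flows running together).
    """
    limiter = threading.BoundedSemaphore(max_concurrent)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}
    results = {}

    def worker(name, func):
        # The semaphore blocks here once `max_concurrent` flows are active.
        with limiter:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            try:
                value = func()
                with lock:
                    results[name] = value
            finally:
                with lock:
                    state["running"] -= 1

    threads = [
        threading.Thread(target=worker, args=(name, func))
        for name, func in flows.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, state["peak"]


def make_flow(number):
    """Build a dummy flow that sleeps briefly and returns a value."""
    def flow():
        time.sleep(0.05)  # stand-in for real extract/load work
        return number * 10
    return flow


flows = {f"flow_{i}": make_flow(i) for i in range(5)}
results, peak = run_flows(flows, max_concurrent=2)
```

Here five dummy flows are started, but the semaphore lets only two of them do work at any moment. A thread pool would serve the same purpose; the semaphore is used only because it makes the limit explicit.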
- CSV file
- SQLite database
- Yandex Metrika Management API
- Yandex Metrika Stats API
- Yandex Metrika Logs API
- Yandex Direct API
- Yandex Direct Report API
Storages
- CSV file
- ClickHouse
Requirements
- Python >= 3.9
- virtual environment
Settings
Installing FlowMaster in a virtual environment is highly recommended.
FlowMaster needs a home directory. The default is '{HOME}/FlowMaster',
but you can optionally choose another location
by setting the FLOWMASTER_HOME environment variable.
For Windows
setx FLOWMASTER_HOME "{YOUR_PATH}"
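As a rough illustration of how a home directory with an environment-variable override is typically resolved, here is a minimal Python sketch. The function name `resolve_home` is hypothetical, and the lookup order shown (FLOWMASTER_HOME first, then '{HOME}/FlowMaster') is an assumption based on the description above, not FlowMaster's confirmed internals.

```python
import os
from pathlib import Path


def resolve_home(env=None):
    """Return the directory to use as the FlowMaster home.

    Hypothetical helper for illustration. Assumes FLOWMASTER_HOME,
    when set and non-empty, takes priority over the default location.
    """
    if env is None:
        env = os.environ
    custom = env.get("FLOWMASTER_HOME")
    if custom:
        return Path(custom)
    # Fall back to the default: a FlowMaster folder in the user's home.
    return Path.home() / "FlowMaster"


home = resolve_home()
```

Note that on Windows, setx stores the variable for future sessions only, so open a new command prompt before checking the value.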