MLOps Lifecycle with MLflow, Airflow, Amazon S3, PostgreSQL

The project is available on my GitHub: https://github.com/Deffro/MLOps

MLOps Pipeline with MLflow + PostgreSQL + Amazon S3 + Apache Airflow

This project implements a complete machine learning lifecycle. The pipeline runs the following steps:

1. Read Data
2. Split train-test
3. Preprocess Data
4. Train Model
      ➙ 5.1 Register Model
      ➙ 5.2 Update Registered Model
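
One plausible reading of the branch after step 4 is a choice between creating a new registry entry (5.1) and adding a version to an existing one (5.2). The sketch below illustrates that reading in plain Python; it is not the project's code. The in-memory `registry` dict, the function `register_or_update`, the model name `churn-model`, and the example URIs are all hypothetical stand-ins for the MLflow model registry.

```python
from typing import Dict, List


def register_or_update(registry: Dict[str, List[str]], name: str, model_uri: str) -> str:
    """Store a trained model and report which branch (5.1 or 5.2) was taken."""
    if name not in registry:
        # 5.1 Register Model: the name is new, so create the entry.
        registry[name] = [model_uri]
        return "register"
    # 5.2 Update Registered Model: the name exists, so append a new version.
    registry[name].append(model_uri)
    return "update"


# Hypothetical registry and model URIs, for illustration only.
registry: Dict[str, List[str]] = {}
first = register_or_update(registry, "churn-model", "runs:/run-1/model")
second = register_or_update(registry, "churn-model", "runs:/run-2/model")
print(first, second, len(registry["churn-model"]))
```

The first call takes the register branch and the second takes the update branch, which leaves two versions stored under the same name.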

The data is the Telco Customer Churn dataset from Kaggle.

Tech Stack

MLflow: For experiment tracking and model registration
PostgreSQL: Store the MLflow tracking data (runs, parameters, metrics)
Amazon S3: Store the registered MLflow models and artifacts
Apache Airflow: Orchestrate the MLOps pipeline
Scikit-learn: Machine Learning
Jupyter: R&D
Python, Anaconda, PyCharm, Docker, Git
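
To make the split between PostgreSQL and S3 concrete, here is a minimal sketch of how an MLflow tracking server is commonly pointed at both: `--backend-store-uri` takes the database and `--default-artifact-root` takes the bucket. This is an assumption about the setup, not a copy of the project's configuration. The `POSTGRES_*` variable names, the host, the port numbers, and the example values are hypothetical; only `AWS_BUCKET_NAME` appears in this document.

```python
from typing import List, Mapping
from urllib.parse import quote_plus


def mlflow_server_command(env: Mapping[str, str]) -> List[str]:
    """Build an `mlflow server` command from environment-style settings."""
    # Tracking data (runs, parameters, metrics) goes to PostgreSQL.
    # The password is URL-encoded so special characters do not break the URI.
    backend_uri = "postgresql://{user}:{password}@{host}:{port}/{db}".format(
        user=env["POSTGRES_USER"],
        password=quote_plus(env["POSTGRES_PASSWORD"]),
        host=env["POSTGRES_HOST"],
        port=env["POSTGRES_PORT"],
        db=env["POSTGRES_DB"],
    )
    # Models and other artifacts go to the S3 bucket.
    artifact_root = "s3://{bucket}/".format(bucket=env["AWS_BUCKET_NAME"])
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", artifact_root,
        "--host", "0.0.0.0",
        "--port", "5000",
    ]


# Hypothetical values, for illustration only.
example_env = {
    "POSTGRES_USER": "mlflow",
    "POSTGRES_PASSWORD": "secret",
    "POSTGRES_HOST": "postgres",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DB": "mlflow",
    "AWS_BUCKET_NAME": "my-bucket",
}
command = mlflow_server_command(example_env)
print(" ".join(command))
```

Returning the command as a list keeps it safe to pass to `subprocess.run` without shell quoting.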

How to reproduce

  1. Have Docker installed and running.
  2. Make sure docker-compose is installed:
    pip install docker-compose
  3. Clone the repository to your machine:
    git clone https://github.com/Deffro/MLOps.git
  4. Rename .env_sample to .env and change the following variables:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • AWS_REGION
    • AWS_BUCKET_NAME
  5. Run the docker-compose file:
    docker-compose up --build -d
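
Before starting the containers, it can help to confirm that the four AWS variables in .env are actually filled in. The sketch below is a hypothetical helper, not part of the repository. It assumes the file uses plain KEY=VALUE lines, and the sample values are invented for illustration.

```python
from typing import Dict, List

# The four variables this README asks you to change.
REQUIRED = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_BUCKET_NAME",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_variables(text: str) -> List[str]:
    """Return the required variables that are absent or empty."""
    values = parse_env(text)
    return [name for name in REQUIRED if not values.get(name)]


# Invented sample content, for illustration only.
sample = """
# illustrative values only
AWS_ACCESS_KEY_ID=AKIAEXAMPLE
AWS_SECRET_ACCESS_KEY=
AWS_REGION=eu-west-1
"""
missing = missing_variables(sample)
print(missing)  # ['AWS_SECRET_ACCESS_KEY', 'AWS_BUCKET_NAME']
```

To check a real file, read it first, for example `missing_variables(open(".env").read())`. An empty list means all four variables have a value.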

URLs to access

Cleanup

Run the following to stop all running containers started by docker-compose:

docker-compose stop

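Or run the following to stop every running container and then remove all containers, running or stopped. Note that this affects every container on the machine, not only the ones from this project: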

docker stop $(docker ps -q)
docker rm $(docker ps -aq)

Finally, run the following to delete all volumes on the machine, named and anonymous. Volumes still in use by a container are not removed:

docker volume rm $(docker volume ls -q)
