Welcome to our MLOps project!
This project has been created as part of an MLOps class.
It aims to demonstrate several tools used in MLOps: tracking of training runs with MLflow, scheduled model retraining through Prefect deployments, model prediction through a FastAPI API, and dockerization of the whole workflow.
For this project, we based ourselves on the Abalone age prediction Kaggle contest, where the goal is to predict the age of an abalone from physical measurements. The dataset is included in the repository, but you can download it from the Kaggle page if needed.
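As a quick illustration of the data, each row describes one abalone's physical measurements plus a `Rings` count; the age target is conventionally derived as `Rings + 1.5` years. This snippet is only a sketch with column names taken from the Kaggle description, not project code:

```python
import pandas as pd

# A few sample rows in the shape of the Kaggle abalone dataset
# (column names assumed from the contest description).
sample = pd.DataFrame(
    {
        "Sex": ["M", "F", "I"],
        "Length": [0.455, 0.530, 0.330],
        "Diameter": [0.365, 0.420, 0.255],
        "Height": [0.095, 0.135, 0.080],
        "Whole weight": [0.514, 0.677, 0.205],
        "Rings": [15, 9, 7],
    }
)

# The regression target: age is conventionally Rings + 1.5 years.
sample["Age"] = sample["Rings"] + 1.5
print(sample[["Rings", "Age"]])
```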
Authors:
- Benjamin Cerf (@benjamincerf57)
- Matthieu Delsart (@matthieudelsart)
- François Lebrun (@FrancoisLbrn)
- Augustin de Saint-Affrique (@AdeStAff)
- Clone the repository on your local machine
git clone https://github.com/matthieudelsart/xhec-mlops-project-student
- Set up environment with conda (recommended)
conda env create --file environment.yml
- Activate the environment
conda activate mlops-project
- Change the version in the `requirements.in` file
- Compile the requirements
./install.sh
- Update your conda environment
conda env update --file environment.yml --prune
You need to install the pre-commit hooks
pre-commit install
Run the whole modelling notebook to create and save experiments using MLflow.
Make sure to be in the notebooks directory in your terminal.
After running the three experiments in the modelling notebook, you can compare them using the MLflow UI. Make sure to be in the `notebooks` directory in your terminal (`cd notebooks`), then run the following line:
mlflow ui --host 0.0.0.0 --port 5002
Then, go to http://localhost:5002
You should then arrive on this UI, on which you can compare the different experiments / models:
Follow these steps:
- Set an API URL for your local server to make sure that your workflow will be tracked by this specific instance:
prefect config set PREFECT_API_URL=http://0.0.0.0:4200/api
Note that Windows users may prefer to use `127.0.0.1` instead of `0.0.0.0`, here and in the other steps.
- Check you have SQLite installed (Prefect backend database system):
sqlite3 --version
- Start a local Prefect server:
prefect server start --host 0.0.0.0
If you want to reset the database, run:
prefect server database reset
You can visit the UI at http://localhost:4200/dashboard
- You can now run the following command in another terminal, at the root of the repository, to schedule regular model retraining (be sure to reactivate your `mlops-project` environment first):
python3 src/modelling/deployment.py
- When on http://localhost:4200/deployments, click on `train-model` to see the scheduled retraining of the model:
- You can click on "Quick run" to train the model now, then go to the bottom of the page and click on the latest run:
- You should then be able to see the training flow and the different tasks within this flow:
Follow these steps:
- Go to the `web_service` folder:
cd src/web_service
- Run the app:
uvicorn main:app --reload
- Click on the link provided: http://localhost:8000/docs
- Click on "Try it out":
- Fill in the data of your observation and then execute:
- Your prediction is given just below:
Make sure you have Docker installed.
- Create the Docker image: go to your terminal and run:
docker build -t project-app -f Dockerfile.app .
- Run it in a container:
docker run -p 8000:8000 -p 4200:4200 project-app
Note: if you are on Windows, you may need to run the following before building the image:
dos2unix ./bin/run_services.sh
- You should see this in your terminal:
- Click on the links that are provided to you to access the API and the Prefect UI on their respective ports. Enjoy! :)