Welcome to our MLOps project!
This project has been created as part of an MLOps class.
It aims to demonstrate several tools used in MLOps: tracking of training runs with MLflow, scheduled model retraining through Prefect deployments, model prediction through a FastAPI API, and dockerization of the whole workflow.
For this project, we based ourselves on the Abalone age prediction Kaggle contest, where the goal is to predict the age of an abalone from physical measurements. The dataset is included in the repository, but you can download it from the Kaggle page if needed.
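As a quick illustration of the data, each row describes one abalone's physical measurements plus a `Rings` count; the age target is conventionally derived as `Rings + 1.5` years. This snippet is only a sketch with column names taken from the Kaggle description, not project code:

```python
import pandas as pd

# A few sample rows in the shape of the Kaggle abalone dataset
# (column names assumed from the contest description).
sample = pd.DataFrame(
    {
        "Sex": ["M", "F", "I"],
        "Length": [0.455, 0.530, 0.330],
        "Diameter": [0.365, 0.420, 0.255],
        "Height": [0.095, 0.135, 0.080],
        "Whole weight": [0.514, 0.677, 0.205],
        "Rings": [15, 9, 7],
    }
)

# The regression target: age is conventionally Rings + 1.5 years.
sample["Age"] = sample["Rings"] + 1.5
print(sample[["Rings", "Age"]])
```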
Authors:
- Benjamin Cerf (@benjamincerf57)
- Matthieu Delsart (@matthieudelsart)
- François Lebrun (@FrancoisLbrn)
- Augustin de Saint-Affrique (@AdeStAff)
- Clone the repository on your local machine
git clone https://github.com/matthieudelsart/xhec-mlops-project-student
- Set up environment with conda (recommended)
conda env create --file environment.yml
- Activate the environment
conda activate mlops-project
- Change the version in the `requirements.in` file
- Compile the requirements
./install.sh
- Update your conda environment
conda env update --file environment.yml --prune
You need to install the pre-commit hooks
pre-commit install
Run the whole modelling notebook to create and save experiments using MLflow.
Make sure to be in the notebooks directory in your terminal.
After running the three experiments in the modelling notebook, you can compare them using the MLflow UI. Make sure to be in the `notebooks` directory in your terminal (`cd notebooks`), then run the following line:
mlflow ui --host 0.0.0.0 --port 5002
Then, go to http://localhost:5002
You should then arrive on this UI, on which you can compare the different experiments / models:
Follow these steps:
- Set an API URL for your local server to make sure that your workflow will be tracked by this specific instance:
prefect config set PREFECT_API_URL=http://0.0.0.0:4200/api
Note that Windows users may prefer to use `127.0.0.1` instead of `0.0.0.0`, here and in the other steps.
- Check you have SQLite installed (Prefect backend database system):
sqlite3 --version
- Start a local Prefect server:
prefect server start --host 0.0.0.0
If you want to reset the database, run:
prefect server database reset
You can visit the UI at http://localhost:4200/dashboard
- You can now run the following command in another terminal, at the root of the repository, to schedule regular model retraining (be sure to reactivate your `mlops-project` environment first):
python3 src/modelling/deployment.py
- When on http://localhost:4200/deployments, click on `train-model` to see the scheduled retraining of the model:
- You can click on "Quick run" to train the model now, then go to the bottom of the page and click on the latest run:
- You should then be able to see the training flow and the different tasks within this flow:
Follow these steps:
- Go to the `web_service` folder:
cd src/web_service
- Run the app:
uvicorn main:app --reload
- Click on the link provided: http://localhost:8000/docs
- Click on "Try it out":
- Fill in the data of your observation and then execute:
- Your prediction is given just below:
Make sure you have Docker installed.
- Create the Docker image: go to your terminal and run:
docker build -t project-app -f Dockerfile.app .
- Run it in a container:
docker run -p 8000:8000 -p 4200:4200 project-app
Note: if you are on Windows, you may need to run the following before building the image:
dos2unix ./bin/run_services.sh
- You should see this in your terminal:
- Click on the links that are provided to you to access the API and the Prefect UI on their respective ports. Enjoy! :)