
lcls-cu-inj-model-deployment-test

This is the main deployment repository for lcls-cu-inj-model-deployment-test, an online deployment of the lcls-cu-inj-model model using LUME-Model.

Overview

This repository was created from the lume-model-deployment-template using copier. It provides a structured, reproducible approach to containerizing and deploying machine learning models with LUME-Model in an online environment, ensuring consistency and ease of use across projects while minimizing deployment boilerplate.

Please refer to the original template repository for detailed documentation on its features and usage. If the template is updated and you want to apply the changes to your project, run:

copier update

This will re-apply the template, preserving your answers and customizations where possible.

How to Use

0. Register Your Model and Prepare PV Mapping

Before using this template, ensure you have registered your model in MLflow and prepared the PV mapping for your deployment. Please see the original template repository's README for instructions on how to do this.

Important

There is currently no validation implemented for this; the user must ensure that the config matches the LUME-model and that the mapping is defined correctly.
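As a quick sanity check (a sketch, not part of the template), you can confirm the model is registered before proceeding; this assumes MLFLOW_TRACKING_URI points at your MLflow server and that the registered name is lcls-cu-inj-model:

export MLFLOW_TRACKING_URI=https://<your-mlflow-server>
python -c "import mlflow; print(mlflow.MlflowClient().get_registered_model('lcls-cu-inj-model'))"

This raises an error if no model is registered under that name.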

1. Create a Deployment Repository

In a Python environment with copier installed, run:

mkdir lume-example-deployment
copier copy gh:slaclab/lume-model-deployment-template lume-example-deployment
cd lume-example-deployment
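If copier is not installed yet, it is available on PyPI:

pip install copier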

2. Add Your PV Mapping

Copy the pv_mapping.yaml file you created in step 0, which maps your PV names to the model features, into the src/online_model/configs/ directory. If you want output PVs to be written back to EPICS, make sure those output PVs are included in this mapping as well.
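For example (replace /path/to/pv_mapping.yaml with wherever you saved the file in step 0):

cp /path/to/pv_mapping.yaml src/online_model/configs/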

3. Initialize and Push to GitHub

Create a new repository on GitHub under the slaclab org (e.g., lume-example-deployment). Note that the repository must be public; otherwise, additional configuration (tokens/authorizations) is needed.

Then run:

git init
git add -A
git commit -m "init commit"
git remote add origin https://github.com/slaclab/lume-example-deployment.git
git push --set-upstream origin main

Once pushed, this will automatically trigger a GitHub Actions workflow to build and push the Docker image to the GitHub Container Registry under slaclab. Once that's done, you can deploy to your target Kubernetes cluster. If ArgoCD is already set up, it will automatically deploy the new image.
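If you prefer the command line, one way to watch the build (assuming the GitHub CLI, gh, is installed and authenticated) is:

gh run list --limit 1
gh run watch

gh run watch will prompt you to select the run and then stream its progress.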

Caution

The image name is set as the registered model name. If the model has already been deployed before from another repository under the same name, the image will already exist under the GitHub Container Registry for slaclab/<other_deployment_name>. In this case, you will need to either delete the existing image from the registry (after making sure it is no longer being used) or change the registered name to a new unique name (e.g., if it's a new type of deployment).
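One quick way to check whether an image already exists under a given name (a sketch; <model-name> is a placeholder for the registered model name, and the default latest tag is an assumption) is to try pulling it:

docker pull ghcr.io/slaclab/<model-name>

If the pull succeeds, the name is already taken in the registry.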

4. Check Your Deployment

To check your deployment, go to your MLflow experiment to see if the run is active with no errors, and plot the input/output variables under run -> model metrics.

If you have access to the vcluster, you can use kubectl to check the pod name and then print the logs:

kubectl get pods
kubectl logs -f <pod-name>
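If the pod is not in a Running state, describing it will show recent events (e.g., image pull or scheduling errors):

kubectl describe pod <pod-name>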

Optional Local Testing

If you want to test the deployment locally before pushing to GitHub, you can build and run the Docker image locally. Make sure you are either on the SLAC network or otherwise have access to the MLflow server you are using, and that you have Docker installed.

Important

The "test" interface does not connect to EPICS and does not use the pv_mapping or any EPICS related transforms or config. It does not test any I/O interface, but instead generates random values for each input variable from their specified ranges. Therefore, the output values are not meaningful, but this allows you to test the most of the inference run.

To build and run the Docker image locally with the test interface, run:

docker build --build-arg INTERFACE=test -t lume:test .
docker run -e INTERFACE=test -e MODEL_VERSION=1 lume:test

You can also access the container's shell with:

docker run -it lume:test bash

If you want to test with a local MLflow server, run the following in a terminal with MLflow installed, using any available port (e.g., 8082); see the MLflow docs for details:

mlflow server --host 127.0.0.1 --port 8082 --gunicorn-opts --timeout=60

Then edit the template_config to set mlflow_tracking_uri="http://127.0.0.1:8082".
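Note that inside the container, 127.0.0.1 refers to the container itself, not your host. On Linux, one option (a sketch, not part of the template) is to run the container on the host network:

docker run --network host -e INTERFACE=test -e MODEL_VERSION=1 lume:test

On macOS or Windows, setting mlflow_tracking_uri="http://host.docker.internal:8082" instead should let the container reach the host.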


For more details, see the lume-model documentation and the Copier documentation.
