This folder contains the resources required for deploying the trained model onto Highwind.
All commands below are run from this directory.
This step builds the KServe predictor image that contains your model.

- First, make sure you have the trained model and tokenizer available locally. To get these, you can do one of the following:
    - Download the files from the Hugging Face model repo and save them to `saved_model` (in the root directory of this example)
    - Run the inference notebooks located in the `notebooks` directory
- Copy over the trained model and its definition code into this folder so that it can be baked into the Docker container for serving

    ```shell
    cp -r ../saved_model .
    ```
- Then build the container locally and give it a tag

    ```shell
    docker build -t local/hw-examples/helsinki-nlp-opus-mt-en-fr:latest .
    ```
- After building the KServe predictor image that contains your model, spin it up to test your model inference

    ```shell
    docker compose up -d
    docker compose logs
    ```
- Finally, send a payload to your model to test its response. To do this, use the `curl` command to send a POST request with an example JSON payload. Run this from another terminal (remember to navigate to this folder first).

    Linux/Mac Bash/zsh:

    ```shell
    curl -X POST http://localhost:8080/v2/models/model/infer -H 'Content-Type: application/json' -d @./input.json
    ```

    Windows PowerShell:

    ```powershell
    $json = Get-Content -Raw -Path ./input.json
    $response = Invoke-WebRequest -Uri http://localhost:8080/v2/models/model/infer -Method Post -ContentType 'application/json' -Body ([System.Text.Encoding]::UTF8.GetBytes($json))
    $responseObject = $response.Content | ConvertFrom-Json
    $responseObject | ConvertTo-Json -Depth 10
    ```
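If you prefer to script the request, the same call can be made from Python. The `input.json` file shipped with this example is the authoritative payload; the sketch below instead builds a request in the shape of the KServe V2 inference protocol, assuming a single BYTES input tensor (the tensor name `input-0` is a guess, not taken from this repo).

```python
import json
import urllib.request


def build_v2_payload(sentences):
    """Build a KServe V2 inference request carrying raw text as BYTES.

    The tensor name and datatype are assumptions; check input.json for
    the exact payload this model expects.
    """
    return {
        "inputs": [
            {
                "name": "input-0",  # assumed tensor name
                "shape": [len(sentences)],
                "datatype": "BYTES",
                "data": sentences,
            }
        ]
    }


def infer(sentences, url="http://localhost:8080/v2/models/model/infer"):
    """POST the payload to the local predictor and return the parsed response."""
    body = json.dumps(build_v2_payload(sentences)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the container from the previous step to be running.
    print(infer(["The weather is lovely today."]))
```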