# Added predict documentation but unsure (#146)

Closed · wants to merge 10 commits
**cerebrium/environments/initial-setup.mdx** (20 additions, 16 deletions)
Your config file can be named anything you want and can be placed anywhere on your machine.

The parameters for your config file are the same as those which you would use as flags for a normal `cerebrium deploy` command. They're tabulated below for your convenience:

| Parameter | Description | Type | Default |
| ------------------- | -------------------------------------------------------------------------------------------- | ------- | ------------------------------------------------------------------ |
| `name` | Name of the deployment | string | |
| `api_key` | API key for the deployment | string | not included for safety |
| `hardware` | Hardware to use for the deployment | string | GPU |
| `gpu_count` | The number of GPUs to use | int | 2 |
| `cpu` | The number of CPU cores to use | int | 2 |
| `memory` | The amount of memory to use, in GB | float | 14.5 |
| `log_level` | Log level for the deployment | string | INFO |
| `include` | Local files to include in the deployment | string | '[./*, main.py, requirements.txt, pkglist.txt, conda_pkglist.txt]' |
| `exclude` | Local files to exclude from the deployment | string | '[./.*, ./__*]' |
| `predict_data` | A JSON object used to test inference on your endpoint when a deployment is made | json | {} |
| `disable_predict` | Whether to disable testing your predict function with `predict_data`. [Optional] | boolean | false |
| `disable_animation` | Whether to disable the animation in the logs | boolean | false |
| `python_version` | The Python version you would like to run | float | 3.9 |
| `min_replicas` | The minimum number of replicas to run | int | 0 |
| `max_replicas` | The maximum number of replicas to scale to | int | \*plan limit |
| `cooldown` | The number of seconds to keep your model warm after each request; resets after every request | int | 60 |

## Config File Example

```yaml
disable_animation: false
python_version: 3.9
min_replicas: 0
max_replicas: 30
predict_data: '{"prompt": "Here is some example predict data from your config.yaml which will be used to test your predict function on build."}'
disable_predict: false # You can set this to true if you'd like to override the predict-on-build functionality.
```
**cerebrium/getting-started/quickstart.mdx** (4 additions, 1 deletion)
```python
def predict(item, run_id, logger):

    # Do something with parameters from item

    return {"key": "value"}
```

The Item class is where you define the parameters your model receives as well as their type. Item needs to inherit from BaseModel which uses Pydantic to validate request schemas.
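As a sketch, a minimal `Item` class might look like this (the `prompt` and `max_length` fields are illustrative examples, not part of the Cerebrium API):

```python
from pydantic import BaseModel

# Hypothetical request schema: `prompt` is required, `max_length` is optional.
class Item(BaseModel):
    prompt: str
    max_length: int = 100

# Request bodies matching the schema validate into an Item instance;
# a missing required field or a mistyped value raises a ValidationError.
item = Item(prompt="Hello, world")
print(item.max_length)  # default applies when the field is omitted
```

Because `Item` inherits from Pydantic's `BaseModel`, malformed requests are rejected before your `predict` function runs.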
You need to define a function with the name **predict** which receives 3 params:

As long as your **main.py** contains the above, you can write any other Python code: import classes, add helper functions, etc.

Take note of the parameters you've defined in your `Item` class: these are the parameters you will pass in the body of API calls to your model endpoint. You can define as many parameters as you like and name them as you see fit; just make sure to update `predict_data` in your **config.yaml** so your model can be tested with sample data on deployment. Otherwise, disable testing by setting `disable_predict` to `true`.
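For example, if your `Item` class defined a single required `prompt` field (a hypothetical parameter for illustration), the matching entry in your **config.yaml** might look like:

```yaml
# Sample input matching a hypothetical Item class with one `prompt` field
predict_data: '{"prompt": "A test prompt for the predict-on-build check."}'
```

The JSON keys in `predict_data` should mirror the field names of your `Item` class, or the build-time test will fail validation.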

### Deploy model

Then navigate to where your model code (specifically your `main.py`) is located and run the following command: