diff --git a/cerebrium/environments/initial-setup.mdx b/cerebrium/environments/initial-setup.mdx
index 75ba567b..ae2704c6 100644
--- a/cerebrium/environments/initial-setup.mdx
+++ b/cerebrium/environments/initial-setup.mdx
@@ -35,22 +35,24 @@ Your config file can be named anything you want and can be placed anywhere on yo
 
 The parameters for your config file are the same as those which you would use as flags for a normal `cerebrium deploy` command. They're tabulated below for your convenience:
 
-| Parameter | Description | Type | Default |
-| ------------------- | ----------------------------------------------------------------------------------------------- | ------- | ------------------------------------------------------------------ |
-| `name` | Name of the deployment | string | |
-| `api_key` | API key for the deployment | string | not included for safety |
-| `hardware` | Hardware to use for the deployment | string | GPU |
-| `gpu_count` | The number of GPUs to specify | int | 2 |
-| `cpu` | The number of CPU cores to use | int | 2 |
-| `memory` | The amount of Memory to use in GB | int | 14.5 |
-| `log_level` | Log level for the deployment | string | INFO |
-| `include` | Local files to include in the deployment | string | '[./*, main.py, requirements.txt, pkglist.txt, conda_pkglist.txt]' |
-| `exclude` | Local Files to exclude from the deployment | string | '[./.*, ./__*]' |
-| `disable_animation` | Whether to disable the animation in the logs. | boolean | false |
-| `python_version` | The Python version you would like to run | float | 3.9 |
-| `min_replicas` | The minimum number of replicas to run. | int | 0 |
-| `max_replicas` | The maximum number of replicas to scale to. | int | \*plan limit |
-| `cooldown` | The number of seconds to keep your model warm after each request. It resets after every request | int | 60 |
+| Parameter | Description | Type | Default |
+| ------------------- | ------------------------------------------------------------------------------------------------ | ------- | ------------------------------------------------------------------ |
+| `name` | Name of the deployment | string | |
+| `api_key` | API key for the deployment | string | not included for safety |
+| `hardware` | Hardware to use for the deployment | string | GPU |
+| `gpu_count` | The number of GPUs to specify | int | 2 |
+| `cpu` | The number of CPU cores to use | int | 2 |
+| `memory` | The amount of Memory to use in GB | int | 14.5 |
+| `log_level` | Log level for the deployment | string | INFO |
+| `include` | Local files to include in the deployment | string | '[./*, main.py, requirements.txt, pkglist.txt, conda_pkglist.txt]' |
+| `exclude` | Local Files to exclude from the deployment | string | '[./.*, ./__*]' |
+| `predict_data` | A JSON object for you to test inference on your endpoint when a deployment is made. | json | {} |
+| `disable_predict` | Whether to disable testing your predict function with input. [Optional] | boolean | false |
+| `disable_animation` | Whether to disable the animation in the logs. | boolean | false |
+| `python_version` | The Python version you would like to run | float | 3.9 |
+| `min_replicas` | The minimum number of replicas to run. | int | 0 |
+| `max_replicas` | The maximum number of replicas to scale to. | int | \*plan limit |
+| `cooldown` | The number of seconds to keep your model warm after each request. It resets after every request. | int | 60 |
 
 ## Config File Example
 
@@ -67,4 +69,6 @@ disable_animation: false
 python_version: 3.9
 min_replicas: 0
 max_replicas: 30
+predict_data: '{"prompt": "Here is some example predict data from your config.yaml which will be used to test your predict function on build."}'
+disable_predict: false # You can set this to true if you'd like to override the predict-on-build functionality.
 ```
diff --git a/cerebrium/getting-started/quickstart.mdx b/cerebrium/getting-started/quickstart.mdx
index 172e88af..0ce9f760 100644
--- a/cerebrium/getting-started/quickstart.mdx
+++ b/cerebrium/getting-started/quickstart.mdx
@@ -32,7 +32,7 @@ def predict(item, run_id, logger):
 
     # Do something with parameters from item
 
-    return {"key": "value}
+    return {"key": "value"}
 ```
 
 The Item class is where you define the parameters your model receives as well as their type. Item needs to inherit from BaseModel which uses Pydantic to validate request schemas.
@@ -48,6 +48,9 @@ You need to define a function with the name **predict** which receives 3 params:
 
 As long as your **main.py** contains the above you can write any other Python code. Import classes, add other functions etc.
 
+Take note of the parameters you've defined in your `Item` class.
+These are the parameters that you will pass to your model when you make an API call to your model endpoint. You can define as many parameters as you like and name them as you see fit. Just make sure to update the `predict_data` in your **config.yaml** so that you can test your model with some sample data. Otherwise, disable testing by setting `disable_predict` to `true`.
+
 ### Deploy model
 
 Then navigate to where your model code (specifically your `main.py`) is located and run the following command: