Showing 27 changed files with 307 additions and 451 deletions.
```diff
@@ -3,10 +3,10 @@ title: "Available Hardware"
 description: "A list of hardware that is available on Cerebrium's platform."
 ---
 
-The Cerebrium platform allows you to quickly and easily fine-tune and deploy machine learning workloads on a variety of different hardware.
+The Cerebrium platform allows you to quickly and easily deploy machine learning workloads on a variety of different hardware.
 We take care of all the hard work so you don't have to. Everything from the hardware drivers to the scaling of your deployments is managed by us so that you can focus on what matters most: your use case.
 
-This page lists the hardware that is currently available on the platform.
+This page lists the hardware that is currently available on the platform. If you would like us to support additional chips on the platform please reach out to [Support](mailto:[email protected])
 
 # Hardware
 
@@ -26,7 +26,7 @@ We have the following graphics cards available on the platform:
 
 _NOTE: The maximum model sizes are calculated as a guideline, assuming that the model is the only thing loaded into VRAM. Longer inputs will result in a smaller maximum model size. Your mileage may vary._
 
-These GPUs can be selected using the `--hardware` flag when deploying your model on Cortex.
+These GPUs can be selected using the `--hardware` flag when deploying your model on Cortex or can be specified in your config.yaml.
 For more help with deciding which GPU you require, see this section [here](#choosing-a-gpu).
 
 _Due to the global shortage of GPUs at the moment, we may not always have the Enterprise edition of your GPU available. In this case, we will deploy to the Workstation edition of the GPU._
```
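The added line above says the GPU can also be specified in config.yaml, but this diff does not show that file's schema. A minimal sketch, assuming the key name `hardware` (mirroring the `--hardware` CLI flag) and using `AMPERE_A10` as a stand-in for any GPU from the hardware table:

```yaml
# config.yaml (sketch): select the GPU for your Cortex deployment.
# The key name "hardware" is an assumption based on the CLI flag;
# replace AMPERE_A10 with a GPU identifier from the table above.
hardware: AMPERE_A10
```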
````diff
@@ -10,16 +10,16 @@ view the pricing of various compute on our [pricing page](https://www.cerebrium.
 When you deploy a model, there are two processes we charge you for:
 
 1. We charge you for the build process in order to setup your model environment. This means we need to download your Python version, download and install Linux packages, Python packages
-   and any model files you require. We only charge you for a model build once until you change your environment ie: add more packages - then we will charge you again since
-   we rebuild the environment. This process happens every time you do 'cerebrium deploy'.
-2. The model runtime. This is the amount of time it takes your code to run from start to finish on each request. There are two 3 costs to consider here:
+   and any model files you require. We only charge you for a model build once until you change your environment ie: add more packages or specify different versions - then we will charge you again since
+   we rebuild the environment. This process happens every time you do 'cerebrium deploy'. However, we do a lot of caching so each subsequent build should be a lot less than the first.
+2. The model runtime. This is the amount of time it takes your code to run from start to finish on each request. There are 3 costs to consider here:
 
 - <u>Cold start</u>: This is the amount of time it takes to spin up a server(s),
   load your environment, connect storage etc. This is part of the Cerebrium
-  service and something we are working on every day to get as low as possible. <b>We do not charge you for this</b>
+  service and something we are working on every day to get as low as possible. <b>We do not charge you for this!</b>
 - <u>Model initialization</u>: This part of your code is outside of the predict
   function and only runs when your model incurs a cold start. You are charged
-  for the amount of time it takes for this code to run.
+  for the amount of time it takes for this code to run. Typically this is loading a model into GPU RAM.
 - <u>Predict runtime</u>: This is the code stored in your predict function and
   runs every time a request hits your endpoint
@@ -70,6 +70,6 @@ Total cost: $121.567/month
 ```
 
 Now you would expect the cost to grow linearly with the number of requests you get however, we want to help you grow your business and so we are happy to discount your computing cost as you grow.
-To see the discounts you can get for your volume, check out our pricing page or contact us directly.
+To see the discounts you can get for your volume, please contact us directly on [Sales](mailto:[email protected])
 
 As you grow, don't worry about scaling your deployment with demand, sourcing new hardware or managing your infrastructure. You can focus on your code and we'll handle the rest.
````
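The billing model described in this diff (cold starts free; model initialization billed only when a cold start happens; predict time billed per request) can be sketched as a small calculation. Every number below is hypothetical, not a real Cerebrium rate:

```python
# Sketch of the billing model described above: cold-start spin-up is
# free, while model initialization (runs only on a cold start) and
# predict time are billed. All rates and timings here are hypothetical.

def monthly_compute_cost(rate_per_second, cold_starts, init_seconds,
                         requests, predict_seconds_per_request):
    """Return the billed compute cost for one month, in dollars."""
    # Model initialization runs once per cold start.
    init_cost = cold_starts * init_seconds * rate_per_second
    # Predict runtime is billed on every request.
    predict_cost = requests * predict_seconds_per_request * rate_per_second
    # Cold-start spin-up time itself is not billed.
    return init_cost + predict_cost

cost = monthly_compute_cost(
    rate_per_second=0.0005,        # hypothetical GPU rate in $/s
    cold_starts=20,                # times the model was loaded from cold
    init_seconds=15,               # model load time per cold start
    requests=10_000,
    predict_seconds_per_request=2,
)
print(f"${cost:.2f}/month")        # prints $10.15/month
```

Note how the initialization cost stays small even with a slow model load, because it is only incurred on cold starts rather than on every request.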
48 changes: 0 additions & 48 deletions
cerebrium/cortex/advanced-functionality/init-cortex-project.mdx
This file was deleted.