Showing 22 changed files with 82 additions and 80 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,7 @@ description: "A list of hardware that is available on Cerebrium's platform." | |
--- | ||
|
||
The Cerebrium platform allows you to quickly and easily deploy machine learning workloads on a variety of different hardware. | ||
-We take care of all the hard work so you don't have to. Everything from the hardware drivers to the scaling of your deployments is managed by us so that you can focus on what matters most: your use case. | ||
+We take care of all the hard work so you don't have to. We manage everything from the hardware drivers to the scaling of your deployments so that you can focus on what matters most: your use case. | ||
|
||
This page lists the hardware that is currently available on the platform. If you would like us to support additional hardware options on the platform please reach out to [Support](mailto:[email protected]) | ||
|
||
|
@@ -30,7 +30,7 @@ These GPUs can be selected using the `--hardware` flag when deploying your model | |
For more help with deciding which GPU you require, see this section [here](#choosing-a-gpu). | ||
|
||
_Due to the global shortage of GPUs at the moment, we may not always have the Enterprise edition of your GPU available. In this case, we will deploy to the Workstation edition of the GPU._ | ||
-_These are the same GPUs and it will not affect the performance of your model in any way._ | ||
+_These are the same GPUs, and it will not affect the performance of your model in any way._ | ||
|
||
## CPUs | ||
|
||
|
@@ -47,7 +47,7 @@ Once again, you only pay for what you need! | |
|
||
## Storage | ||
|
||
-We provide you with a persistent storage volume that is attached to your deployment. | ||
+We provide you with a persistent storage volume attached to your deployment. | ||
You can use this storage volume to store any data that you need to persist between deployments. Accessing your persistent storage is covered in depth for [cortex here](./cerebrium/data-sharing-storage/persistent-storage). | ||
|
||
The storage volume is backed by high-performance SSDs so that you can get the best performance possible | ||
|
@@ -56,7 +56,7 @@ Pricing for storage is based on the amount of storage you use and is charged per | |
# Determine your Hardware Requirements | ||
|
||
Deciding which hardware you require for your deployment can be a daunting task. | ||
-On one hand, you want the best performance possible but on the other hand, you don't want to pay for more resources than you need. | ||
+On one hand, you want the best performance possible, but on the other hand, you don't want to pay for more resources than you need. | ||
|
||
## Choosing a GPU | ||
|
||
|
@@ -71,20 +71,20 @@ You can calculate the VRAM usage of your model by using the following formula: | |
modelVRAM = numParams x numBytesPerDataType | ||
``` | ||
|
||
-For example, if you have a model that is 7B parameters and you decide to use 32-bit Floating point precision, you can calculate the VRAM usage as follows: | ||
+For example, if you have a model that is 7B parameters, and you decide to use 32-bit Floating point precision, you can calculate the VRAM usage as follows: | ||
|
||
```python | ||
modelVRAM = 7B x 4 = 28GB | ||
``` | ||
|
||
When you include the 1.5x multiplier from our rule of thumb, this means that you should choose a GPU with at least ~40GB of VRAM to ensure that your model will fit on the hardware you choose. | ||
|
||
-Alternatively, if you were happy with the slight precision penalty of using quantisation, your model would have required 7GB of VRAM for 8-bit quantisation. So you could have chosen a GPU with 16GB of VRAM. This is the approach we recommend especially with large models (>20B parameters) as the precision penalty is minimal and your cost savings are substantial. | ||
+Alternatively, if you were happy with the slight precision penalty of using quantisation, your model would have required 7GB of VRAM for 8-bit quantisation. So you could have chosen a GPU with 16GB of VRAM. This is the approach we recommend, especially with large models (>20B parameters) as the precision penalty is minimal and your cost savings are substantial. | ||
|
||
<Note> | ||
-Pro tip: The precision loss from quantisation is negligible in comparison to | ||
-the performance gains you get from the larger model that can fit on the same | ||
-hardware. | ||
+Pro tip: The precision loss from quantisation is negligible in comparison to | ||
+the performance gains you get from the larger model that can fit on the same | ||
+hardware. | ||
</Note> | ||
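
The `modelVRAM` formula and the 1.5x rule of thumb quoted in this hunk can be checked with a few lines of Python. This is a minimal sketch rather than anything from the Cerebrium docs or tooling: the names `BYTES_PER_DTYPE`, `model_vram_gb` and `estimate_vram_gb` are hypothetical, and 1 GB is assumed to be 10^9 bytes, which matches the 7B x 4 = 28GB example.

```python
# Rule-of-thumb VRAM estimate: numParams x numBytesPerDataType x 1.5.
# All names here are hypothetical and exist only for illustration.

# Bytes per parameter for common precisions (fp32 = 32-bit float, int8 = 8-bit quantisation).
BYTES_PER_DTYPE = {"fp32": 4, "fp16": 2, "int8": 1}

# The 1.5x multiplier from the rule of thumb quoted in the text.
OVERHEAD_MULTIPLIER = 1.5


def model_vram_gb(num_params: float, dtype: str) -> float:
    """Raw footprint of the weights in GB, assuming 1 GB = 1e9 bytes."""
    return num_params * BYTES_PER_DTYPE[dtype] / 1e9


def estimate_vram_gb(num_params: float, dtype: str,
                     overhead: float = OVERHEAD_MULTIPLIER) -> float:
    """Weights footprint scaled by the rule-of-thumb multiplier."""
    return model_vram_gb(num_params, dtype) * overhead


if __name__ == "__main__":
    # Print both figures for a 7B-parameter model at each precision.
    for dtype in BYTES_PER_DTYPE:
        raw = model_vram_gb(7e9, dtype)
        padded = estimate_vram_gb(7e9, dtype)
        print(f"7B @ {dtype}: weights {raw:.1f} GB, with multiplier {padded:.1f} GB")
```

For a 7B model this gives 28 GB of weights at 32-bit precision (42 GB with the multiplier, in line with the "~40GB" guidance) and 7 GB at 8-bit (10.5 GB with the multiplier, which fits a 16GB GPU).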
|
||
## Setting your number of CPU Cores | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -33,7 +33,7 @@ You can send us feedback requests at [[email protected]](mailto:support@cereb | |
- Define pip/conda container environments in code | ||
- Secrets manager | ||
- One-click deploys | ||
-- Persistant Storage | ||
+- Persistent Storage | ||
|
||
<b>All of this in just a few lines of code!</b> | ||
|
||
|