docs(dask): add initial dask docs for users and admins (#212)
`docs/administration/configuration/configuring-dask/index.md`

# Configuring Dask

Dask integration in REANA allows users to request dedicated Dask clusters for their workflows. Each cluster operates independently, providing the computational resources necessary to execute workflows efficiently.

## Enabling Dask

Dask support is disabled by default in REANA. If you would like to enable it so that users can request a Dask cluster for their workflows, set the [`dask.enabled`](https://github.com/reanahub/reana/tree/master/helm/reana) Helm value to `true`.
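
For example, a minimal sketch of the corresponding Helm values, assuming the standard layout of the `reana` chart:

```yaml
# values.yaml (sketch): allow users to request Dask clusters
dask:
  enabled: true
```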

## Configuring Autoscaler

Each Dask cluster in REANA comes with an autoscaler by default. If you would like to disable the autoscaler feature, set the [`dask.autoscaler_enabled`](https://github.com/reanahub/reana/tree/master/helm/reana) Helm value to `false`.
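
A sketch of the corresponding values change, under the same assumption about the chart layout:

```yaml
# values.yaml (sketch): turn off autoscaling for all Dask clusters
dask:
  autoscaler_enabled: false
```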

The autoscaler manages the Dask cluster for your workflow by scaling up to a maximum of N workers when needed and scaling down during less resource-intensive periods. You can define the number of workers (N) in your `reana.yaml` file or use the default worker count set for Dask clusters.

For more details on how the autoscaler works under the hood, you can check the [official Dask Kubernetes Operator autoscaler documentation](https://kubernetes.dask.org/en/latest/operator_resources.html#daskautoscaler).

## Limiting Cluster Memory

The maximum memory allocated for a Dask cluster can be configured using the [`dask.cluster_max_memory_limit`](https://github.com/reanahub/reana/tree/master/helm/reana) Helm value, which is set to `16Gi` by default. This setting defines the upper memory limit that users can request for a cluster, based on the combined memory usage of all workers.

For instance, if `dask.cluster_max_memory_limit` is set to `9Gi`, a user can request a cluster with 3 workers, each using up to 3Gi of memory. Any configuration exceeding this limit (e.g. 5 workers with 2Gi each, totalling 10Gi) will not be permitted.
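
A values sketch matching this example:

```yaml
# values.yaml (sketch): cap the combined memory of all workers in one cluster
dask:
  cluster_max_memory_limit: "9Gi"
```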

## Configuring Default and Maximum Number of Workers

When configuring Dask clusters in REANA, there are two important Helm values that control the number of workers in a cluster:

### [`dask.cluster_default_number_of_workers`](https://github.com/reanahub/reana/tree/master/helm/reana)

This value determines the default number of workers assigned to a Dask cluster if the user does not explicitly specify the `number_of_workers` field in their `reana.yaml` workflow configuration file. Setting this value ensures that all Dask clusters start with a reasonable number of workers to handle typical workloads without requiring user input.

For example:

```yaml
dask:
  cluster_default_number_of_workers: 3
```

In this case, every Dask cluster will start with 3 workers unless a different number is provided in the `reana.yaml`.

If the cluster administrator does not overwrite the `dask.cluster_default_number_of_workers` variable, it is set to `2` by default.

### [`dask.cluster_max_number_of_workers`](https://github.com/reanahub/reana/tree/master/helm/reana)

This value defines the upper limit on the number of workers a user can request in their `reana.yaml`, even if their workflow does not reach the `dask.cluster_max_memory_limit`. It acts as a safeguard to prevent users from requesting an excessive number of workers with very low memory allocations (e.g. 100 workers with only 30Mi of memory each).

For example:

```yaml
dask:
  cluster_max_number_of_workers: 50
```

In this case, users can request up to 50 workers in their `reana.yaml`. If they attempt to request more than this maximum, the system will cap the cluster size at 50 workers, regardless of the memory limits.

If the cluster administrator does not overwrite the `dask.cluster_max_number_of_workers` variable, it is set to `20` by default.

## Configuring Default and Maximum Memory for Single Workers

In addition to managing the number of workers in a Dask cluster, it is crucial to configure memory limits for individual workers to ensure resource efficiency and prevent workloads from exceeding the capacity of your cluster nodes. REANA provides two Helm values for controlling the default and maximum memory allocation per worker:

### [`dask.cluster_default_single_worker_memory`](https://github.com/reanahub/reana/tree/master/helm/reana)

This value sets the default memory allocated to a single worker in a Dask cluster if the user does not specify the `single_worker_memory` field in their `reana.yaml` workflow configuration file.

For example:

```yaml
dask:
  cluster_default_single_worker_memory: "2Gi"
```

In this case, if the user does not explicitly set a memory limit for their workers, each worker in the Dask cluster will be allocated 2Gi of memory by default. This ensures a predictable baseline configuration for workflows.

If the cluster administrator does not overwrite the `dask.cluster_default_single_worker_memory` variable, it is set to `2Gi` by default.

### [`dask.cluster_max_single_worker_memory`](https://github.com/reanahub/reana/tree/master/helm/reana)

This value defines the upper memory limit that a user can request for a single worker in their `reana.yaml`. It acts as a safeguard to prevent users from allocating more memory than the underlying Kubernetes nodes can handle, which could lead to scheduling failures.

For example:

```yaml
dask:
  cluster_max_single_worker_memory: "15Gi"
```

In this case, users can request up to 15Gi of memory for each worker. Any request exceeding this limit (e.g. 20Gi) will be rejected. This ensures that users cannot allocate more memory than is safe for the cluster's infrastructure.

If the cluster administrator does not overwrite the `dask.cluster_max_single_worker_memory` variable, it is set to `8Gi` by default.
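
Putting it all together, a sketch of the Dask-related Helm values with the defaults documented above, again assuming the standard layout of the `reana` chart:

```yaml
# values.yaml (sketch): Dask-related settings with their documented defaults
dask:
  enabled: true                               # disabled by default; set to true to enable Dask
  autoscaler_enabled: true                    # autoscaler is on by default
  cluster_max_memory_limit: "16Gi"            # combined memory cap per cluster
  cluster_default_number_of_workers: 2        # used when reana.yaml omits number_of_workers
  cluster_max_number_of_workers: 20           # hard cap on requested workers
  cluster_default_single_worker_memory: "2Gi" # used when reana.yaml omits single_worker_memory
  cluster_max_single_worker_memory: "8Gi"     # per-worker memory cap
```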
# Dask

REANA supports the integration of Dask clusters to provide scalable, distributed computing capabilities for workflows. This documentation explains how to set up and configure a Dask cluster dedicated to your workflow, how to query the cluster settings and limits, and how to use Dask features such as the dashboard for monitoring your workflows.

## Setting up a Dask cluster

To configure your Dask cluster, you can set three different variables via `reana.yaml`:

1. **`image` (mandatory)**
   Specifies the Docker image to be used by the Dask workers, the scheduler, and the job pod that executes your analysis. Ensure the image includes all dependencies required for your workflow.

2. **`number_of_workers` (optional)**
   Defines the number of Dask workers for your cluster. If not specified, a default value configured by your REANA cluster administrator will be used. If you request more workers than the allowed maximum, the following error occurs:

   ```console
   $ reana-client run
   ...
   ==> ERROR: Cannot create workflow :
   The number of requested Dask workers (N) exceeds the maximum limit (M).
   ```

   In such cases, reduce the number of workers and try again.

3. **`single_worker_memory` (optional)**
   Sets the amount of memory allocated to each Dask worker. If not specified, a default value configured by your REANA cluster administrator will be used. Requests exceeding the maximum allowed memory per worker will also result in the following error:

   ```console
   $ reana-client run
   ...
   ==> ERROR: Cannot create workflow :
   The "single_worker_memory" provided in the dask resources exceeds the limit (8Gi).
   ```

An example configuration:

```yaml
# ...
resources:
  # ...
  dask:
    image: docker.io/coffeateam/coffea-dask-cc7:0.7.22-py3.10-g7f049
    number_of_workers: 3
    single_worker_memory: 4Gi
```

## Querying Dask Settings and Limits

REANA administrators configure certain settings and limits for Dask clusters, such as the maximum memory limit and whether the autoscaler is enabled. You can query them with the `reana-client info` command:

```console
$ reana-client info
...
Dask autoscaler enabled in the cluster: True
The number of Dask workers created by default: 2
The amount of memory used by default by a single Dask worker: 2Gi
The maximum memory limit for Dask clusters created by users: 16Gi
The maximum number of workers that users can ask for the single Dask cluster: 20
The maximum amount of memory that users can ask for the single Dask worker: 8Gi
Dask workflows allowed in the cluster: True
...
```

## Usage

Once you configure your Dask cluster via `reana.yaml`, REANA runs your workflow within an environment that provides the scheduler's URI via an environment variable called `DASK_SCHEDULER_URI`. You can use this environment variable as follows:

```python
import os

from dask.distributed import Client

# Connect to the Dask cluster that REANA created for this workflow
DASK_SCHEDULER_URI = os.getenv("DASK_SCHEDULER_URI")
client = Client(DASK_SCHEDULER_URI)

# ... submit your analysis computations through `client` ...
```

You can also refer to the [reana-demo-dask-coffea](https://github.com/reanahub/reana-demo-dask-coffea) demo for a full example that uses Dask with analysis code and `reana.yaml`.

## Dask Dashboard

You can inspect your analysis and the Dask cluster via the Dask dashboard by clicking the dashboard icon.

SS Here --> Icon URL
SS Here --> Dashboard SS