Problem

Multi-node tasks can only run on fleets with placement: cluster (source code), which means all nodes must be in the same backend, region, and network.
Some distributed workloads don't require network connectivity between nodes. For example, worker nodes in a distributed data processing workload may fetch data from an external source and upload the processing results to the same source, without ever communicating with other worker nodes or even knowing that other nodes exist.
Currently, it is not possible to run such workloads on backends that don't support private networks (CUDO, DataCrunch, Lambda, RunPod, TensorDock, Vast.ai, Kubernetes) or to run them across backends and regions to optimize costs.
Solution
Allow specifying placement in task configurations. placement: cluster keeps the current behavior (nodes must be interconnected), while placement: any allows non-interconnected nodes across backends and regions.
The cluster-specific environment variables DSTACK_MASTER_NODE_IP and DSTACK_NODES_IPS are only available with placement: cluster.
The default is placement: cluster for backward compatibility.
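A minimal sketch of what a task configuration could look like under this proposal. The top-level placement field is the proposed addition, and the run name, script, data source, and resource spec are made up for illustration:

```yaml
type: task
name: shard-processing    # hypothetical run name
nodes: 8                  # eight independent workers
placement: any            # proposed field: nodes may land on different backends/regions

env:
  - DATA_SOURCE=s3://example-bucket/input  # hypothetical external data source

commands:
  - python process_shard.py  # each node fetches its input and uploads results on its own

resources:
  gpu: 24GB
```

With placement: any, DSTACK_MASTER_NODE_IP and DSTACK_NODES_IPS would not be set, since the nodes share no private network.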
Workaround
Multiple single-node runs.
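For example (a sketch, with a hypothetical process_shard.py script and SHARD_ID variable), each worker can be submitted as its own single-node run, varying the shard per run:

```yaml
type: task
name: worker-0            # submit one such run per worker (worker-1, worker-2, ...)
nodes: 1                  # single node, so no cluster placement is required

env:
  - SHARD_ID=0            # change per run so each worker processes a different shard

commands:
  - python process_shard.py --shard $SHARD_ID

resources:
  gpu: 24GB
```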
Would you like to help us implement this feature by sending a PR?
Yes