[Feature]: Support ARM CPUs #2101

Open
jvstme opened this issue Dec 16, 2024 · 3 comments

jvstme (Collaborator) commented Dec 16, 2024

Problem

Many cloud providers offer instances with ARM CPUs, which can sometimes be more cost-efficient. There are also accelerators that are only available along with an ARM CPU, such as the NVIDIA GH200 chip. dstack does not support running jobs on ARM CPUs.

Solution

Support running jobs on ARM CPUs.

Some things to consider:

  • Research running amd64 Docker containers on arm64 hosts and impact on performance. Decide whether and how dstack will handle architecture emulation for Docker.
  • Add a build target for dstack-shim and possibly dstack-runner in CI.
  • Distinguish arm64 and amd64 offers in gpuhunt and dstack and choose appropriate shim and runner builds when provisioning instances and starting jobs.
  • Provide a way to specify the CPU architecture in on-prem fleets or detect it automatically.
  • Provide a way to filter offers by architecture.
  • Add ARM offers for at least one cloud provider.
  • (?) Build dstack OS images for ARM.
  • (?) Build dstack Docker images for ARM.
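
Several of the items above (distinguishing arm64 and amd64 offers, detecting the host architecture, filtering offers, choosing a shim build) hinge on one normalized architecture value. A minimal Python sketch of that idea follows. The `Offer` class, `filter_offers`, and the binary naming in `shim_binary_name` are hypothetical illustrations, not the actual gpuhunt or dstack models or release artifact names.

```python
import platform
from dataclasses import dataclass
from typing import Iterable, List

# Common spellings of the two architectures as reported by `uname -m`,
# Docker, and Go toolchains.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(name: str) -> str:
    """Map a machine/architecture string to "amd64" or "arm64"."""
    key = name.strip().lower()
    if key not in _ARCH_ALIASES:
        raise ValueError(f"unsupported CPU architecture: {name!r}")
    return _ARCH_ALIASES[key]


def detect_host_arch() -> str:
    """Detect the architecture of the machine this code runs on."""
    return normalize_arch(platform.machine())


@dataclass(frozen=True)
class Offer:
    """Hypothetical offer record; not the real gpuhunt model."""
    instance_name: str
    cpu_arch: str
    price: float


def filter_offers(offers: Iterable[Offer], arch: str) -> List[Offer]:
    """Keep only the offers whose CPU architecture matches `arch`."""
    wanted = normalize_arch(arch)
    return [o for o in offers if normalize_arch(o.cpu_arch) == wanted]


def shim_binary_name(arch: str) -> str:
    """Hypothetical naming scheme for per-architecture shim builds."""
    return f"dstack-shim-linux-{normalize_arch(arch)}"
```

For an on-prem host, `detect_host_arch()` could run on the host itself during onboarding, while cloud offers would carry the architecture as catalog data. Either way the same normalized value would select the shim and runner builds.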

Workaround

No response

Would you like to help us implement this feature by sending a PR?

Yes

@jvstme jvstme added the feature label Dec 16, 2024
@peterschmidt85 peterschmidt85 pinned this issue Dec 23, 2024
@peterschmidt85 peterschmidt85 changed the title from "[Feature]: Support ARM CPUs for jobs" to "[Feature]: Support ARM CPUs" Dec 23, 2024

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Jan 23, 2025

github-actions bot commented Feb 6, 2025

This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.

@github-actions github-actions bot closed this as not planned Feb 6, 2025
@peterschmidt85 peterschmidt85 reopened this Feb 7, 2025
@github-actions github-actions bot removed the stale label Feb 8, 2025
un-def (Collaborator) commented Feb 19, 2025

Research running amd64 Docker containers on arm64 hosts and impact on performance

One option is to use QEMU via a binfmt_misc handler, e.g., https://github.com/dbhi/qus
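
Whether such a handler is already registered on a host can be checked by reading the kernel's binfmt_misc directory. The sketch below assumes the handler is registered under the name `qemu-x86_64`, which is the conventional name used by QEMU user-mode packages but may differ on a given host. The function name and its parameters are illustrative.

```python
from pathlib import Path


def binfmt_handler_enabled(
    name: str = "qemu-x86_64",
    root: str = "/proc/sys/fs/binfmt_misc",
) -> bool:
    """Return True if a binfmt_misc entry `name` exists and is enabled.

    Each registered handler appears as a file under `root`; its first
    line is "enabled" or "disabled". The default entry name is an
    assumption and may differ between hosts.
    """
    entry = Path(root) / name
    try:
        lines = entry.read_text().splitlines()
    except OSError:
        # Entry missing, binfmt_misc not mounted, or not readable.
        return False
    return bool(lines) and lines[0].strip() == "enabled"
```

A shim running on an arm64 host could use a check like this to decide whether amd64 images can be started at all, or whether the job should be rejected up front.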

The performance penalty seems to be substantial: roughly 7x fewer events per second under emulation. Results of sysbench cpu run on the same AWS Graviton machine, on Ubuntu inside a Docker container:

  • natively
CPU speed:
    events per second:  2808.17

General statistics:
    total time:                          10.0001s
    total number of events:              28086

Latency (ms):
         min:                                    0.35
         avg:                                    0.36
         max:                                    2.36
         95th percentile:                        0.36
         sum:                                 9995.26

Threads fairness:
    events (avg/stddev):           28086.0000/0.00
    execution time (avg/stddev):   9.9953/0.00
  • x86 image via QEMU
CPU speed:
    events per second:   385.74

General statistics:
    total time:                          10.0028s
    total number of events:              3862

Latency (ms):
         min:                                    2.56
         avg:                                    2.59
         max:                                    5.15
         95th percentile:                        2.66
         sum:                                 9985.28

Threads fairness:
    events (avg/stddev):           3862.0000/0.00
    execution time (avg/stddev):   9.9853/0.00
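
The slowdown can be computed directly from the two "events per second" lines above. The helper below is a small illustrative sketch; the function name is made up, and the input strings are excerpts of the sysbench output quoted in this comment.

```python
import re


def events_per_second(sysbench_output: str) -> float:
    """Extract the "events per second" value from sysbench cpu output."""
    match = re.search(r"events per second:\s*([0-9.]+)", sysbench_output)
    if match is None:
        raise ValueError("no 'events per second' line found")
    return float(match.group(1))


# Excerpts of the two runs reported above.
native = "CPU speed:\n    events per second:  2808.17\n"
emulated = "CPU speed:\n    events per second:   385.74\n"

slowdown = events_per_second(native) / events_per_second(emulated)
print(f"slowdown: {slowdown:.1f}x")  # prints "slowdown: 7.3x"
```

This is a single-threaded, CPU-bound microbenchmark, so the ratio is only a rough indicator and real workloads may behave differently.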

@peterschmidt85 peterschmidt85 unpinned this issue Feb 27, 2025