chore(scripts): check Docker nvidia runtime before enabling GPU compose #16562
4e2e7ef to 0220038
Performance SLOs
Comparing candidate alex/fix-ddtest-nvidia-detection (0220038) with baseline main (0eb8b71)

📈 Performance Regressions (2 suites)

📈 iastaspects - 118/118
✅ add_aspect: Time: ✅ 103.145µs (SLO: <130.000µs 📉 -20.7%) vs baseline: +1.0%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ add_inplace_aspect: Time: ✅ 100.898µs (SLO: <130.000µs 📉 -22.4%) vs baseline: -1.5%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ add_inplace_noaspect: Time: ✅ 28.494µs (SLO: <40.000µs 📉 -28.8%) vs baseline: +1.5%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%
✅ add_noaspect: Time: ✅ 48.783µs (SLO: <70.000µs 📉 -30.3%) vs baseline: -0.6%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ bytearray_aspect: Time: ✅ 250.276µs (SLO: <400.000µs 📉 -37.4%) vs baseline: +1.5%; Memory: ✅ 42.979MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ bytearray_extend_aspect: Time: ✅ 635.162µs (SLO: <800.000µs 📉 -20.6%) vs baseline: -2.6%; Memory: ✅ 42.979MB (SLO: <46.000MB -6.6%) vs baseline: +4.2%
✅ bytearray_extend_noaspect: Time: ✅ 262.607µs (SLO: <400.000µs 📉 -34.3%) vs baseline: -2.6%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%
✅ bytearray_noaspect: Time: ✅ 136.923µs (SLO: <300.000µs 📉 -54.4%) vs baseline: -1.3%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.5%
✅ bytes_aspect: Time: ✅ 217.035µs (SLO: <300.000µs 📉 -27.7%) vs baseline: -1.4%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.2%
✅ bytes_noaspect: Time: ✅ 132.080µs (SLO: <200.000µs 📉 -34.0%) vs baseline: -1.8%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.5%
✅ bytesio_aspect: Time: ✅ 3.767ms (SLO: <5.000ms 📉 -24.7%) vs baseline: -0.5%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ bytesio_noaspect: Time: ✅ 313.798µs (SLO: <420.000µs 📉 -25.3%) vs baseline: -1.6%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%
✅ capitalize_aspect: Time: ✅ 89.231µs (SLO: <300.000µs 📉 -70.3%) vs baseline: +0.6%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ capitalize_noaspect: Time: ✅ 251.644µs (SLO: <300.000µs 📉 -16.1%) vs baseline: -0.1%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ casefold_aspect: Time: ✅ 89.323µs (SLO: <500.000µs 📉 -82.1%) vs baseline: -0.2%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.1%
✅ casefold_noaspect: Time: ✅ 308.670µs (SLO: <500.000µs 📉 -38.3%) vs baseline: -0.5%; Memory: ✅ 42.880MB (SLO: <46.000MB -6.8%) vs baseline: +4.3%
✅ decode_aspect: Time: ✅ 86.700µs (SLO: <100.000µs 📉 -13.3%) vs baseline: ~same; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ decode_noaspect: Time: ✅ 152.967µs (SLO: <210.000µs 📉 -27.2%) vs baseline: -0.7%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%
✅ encode_aspect: Time: ✅ 84.566µs (SLO: <200.000µs 📉 -57.7%) vs baseline: +0.2%; Memory: ✅ 42.861MB (SLO: <46.000MB -6.8%) vs baseline: +4.1%
✅ encode_noaspect: Time: ✅ 140.851µs (SLO: <200.000µs 📉 -29.6%) vs baseline: -1.3%; Memory: ✅ 42.880MB (SLO: <46.000MB -6.8%) vs baseline: +4.2%
✅ format_aspect: Time: ✅ 14.678ms (SLO: <19.200ms 📉 -23.6%) vs baseline: +0.4%; Memory: ✅ 42.979MB (SLO: <46.000MB -6.6%) vs baseline: +4.2%
✅ format_map_aspect: Time: ✅ 16.423ms (SLO: <21.500ms 📉 -23.6%) vs baseline: +0.1%; Memory: ✅ 42.979MB (SLO: <46.000MB -6.6%) vs baseline: +4.2%
✅ format_map_noaspect: Time: ✅ 370.215µs (SLO: <500.000µs 📉 -26.0%) vs baseline: -1.1%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ format_noaspect: Time: ✅ 300.824µs (SLO: <500.000µs 📉 -39.8%) vs baseline: -1.9%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ index_aspect: Time: ✅ 124.731µs (SLO: <300.000µs 📉 -58.4%) vs baseline: +2.5%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.8%
✅ index_noaspect: Time: ✅ 40.148µs (SLO: <300.000µs 📉 -86.6%) vs baseline: -0.3%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.2%
✅ join_aspect: Time: ✅ 210.905µs (SLO: <300.000µs 📉 -29.7%) vs baseline: -1.3%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.6%
✅ join_noaspect: Time: ✅ 141.028µs (SLO: <300.000µs 📉 -53.0%) vs baseline: -1.8%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ ljust_aspect: Time: ✅ 585.866µs (SLO: <700.000µs 📉 -16.3%) vs baseline: 📈 +15.6%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.6%
✅ ljust_noaspect: Time: ✅ 256.327µs (SLO: <300.000µs 📉 -14.6%) vs baseline: -1.3%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%
✅ lower_aspect: Time: ✅ 294.228µs (SLO: <500.000µs 📉 -41.2%) vs baseline: -2.6%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.0%
✅ lower_noaspect: Time: ✅ 235.249µs (SLO: <300.000µs 📉 -21.6%) vs baseline: +0.4%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ lstrip_aspect: Time: ✅ 0.270ms (SLO: <3.000ms 📉 -91.0%) vs baseline: -3.2%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.7%
✅ lstrip_noaspect: Time: ✅ 0.177ms (SLO: <3.000ms 📉 -94.1%) vs baseline: -0.8%; Memory: ✅ 42.979MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ modulo_aspect: Time: ✅ 14.403ms (SLO: <18.750ms 📉 -23.2%) vs baseline: +0.2%; Memory: ✅ 43.018MB (SLO: <46.000MB -6.5%) vs baseline: +4.4%
✅ modulo_aspect_for_bytearray_bytearray: Time: ✅ 14.763ms (SLO: <19.350ms 📉 -23.7%) vs baseline: -0.1%; Memory: ✅ 43.057MB (SLO: <46.000MB -6.4%) vs baseline: +4.3%
✅ modulo_aspect_for_bytes: Time: ✅ 14.420ms (SLO: <18.900ms 📉 -23.7%) vs baseline: +0.3%; Memory: ✅ 43.037MB (SLO: <46.000MB -6.4%) vs baseline: +4.6%
✅ modulo_aspect_for_bytes_bytearray: Time: ✅ 14.557ms (SLO: <19.150ms 📉 -24.0%) vs baseline: -0.4%; Memory: ✅ 43.018MB (SLO: <46.000MB -6.5%) vs baseline: +4.2%
✅ modulo_noaspect: Time: ✅ 0.362ms (SLO: <3.000ms 📉 -87.9%) vs baseline: +0.3%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.5%
✅ replace_aspect: Time: ✅ 18.461ms (SLO: <24.000ms 📉 -23.1%) vs baseline: +0.5%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.2%
✅ replace_noaspect: Time: ✅ 281.970µs (SLO: <300.000µs -6.0%) vs baseline: -0.8%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%
✅ repr_aspect: Time: ✅ 311.142µs (SLO: <420.000µs 📉 -25.9%) vs baseline: -3.3%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ repr_noaspect: Time: ✅ 47.031µs (SLO: <90.000µs 📉 -47.7%) vs baseline: +0.6%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ rstrip_aspect: Time: ✅ 380.828µs (SLO: <500.000µs 📉 -23.8%) vs baseline: -1.0%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.6%
✅ rstrip_noaspect: Time: ✅ 183.350µs (SLO: <300.000µs 📉 -38.9%) vs baseline: -0.8%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ slice_aspect: Time: ✅ 186.420µs (SLO: <300.000µs 📉 -37.9%) vs baseline: +2.4%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ slice_noaspect: Time: ✅ 54.232µs (SLO: <90.000µs 📉 -39.7%) vs baseline: +1.0%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ stringio_aspect: Time: ✅ 4.398ms (SLO: <5.000ms 📉 -12.0%) vs baseline: 📈 +14.8%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.5%
✅ stringio_noaspect: Time: ✅ 347.076µs (SLO: <500.000µs 📉 -30.6%) vs baseline: -0.7%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ strip_aspect: Time: ✅ 270.583µs (SLO: <350.000µs 📉 -22.7%) vs baseline: -1.3%; Memory: ✅ 42.979MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ strip_noaspect: Time: ✅ 176.506µs (SLO: <240.000µs 📉 -26.5%) vs baseline: -1.0%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.2%
✅ swapcase_aspect: Time: ✅ 332.558µs (SLO: <500.000µs 📉 -33.5%) vs baseline: -2.0%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ swapcase_noaspect: Time: ✅ 271.326µs (SLO: <400.000µs 📉 -32.2%) vs baseline: -2.1%; Memory: ✅ 42.939MB (SLO: <46.000MB -6.7%) vs baseline: +4.3%
✅ title_aspect: Time: ✅ 319.454µs (SLO: <500.000µs 📉 -36.1%) vs baseline: -5.2%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ title_noaspect: Time: ✅ 258.521µs (SLO: <400.000µs 📉 -35.4%) vs baseline: -1.2%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.3%
✅ translate_aspect: Time: ✅ 491.107µs (SLO: <700.000µs 📉 -29.8%) vs baseline: -1.6%; Memory: ✅ 42.841MB (SLO: <46.000MB -6.9%) vs baseline: +4.1%
✅ translate_noaspect: Time: ✅ 425.205µs (SLO: <500.000µs 📉 -15.0%) vs baseline: -1.3%; Memory: ✅ 42.861MB (SLO: <46.000MB -6.8%) vs baseline: +4.2%
✅ upper_aspect: Time: ✅ 295.670µs (SLO: <500.000µs 📉 -40.9%) vs baseline: -2.0%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +4.4%
✅ upper_noaspect: Time: ✅ 235.089µs (SLO: <400.000µs 📉 -41.2%) vs baseline: +0.2%; Memory: ✅ 42.920MB (SLO: <46.000MB -6.7%) vs baseline: +4.4%

📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect: Time: ✅ 508.031µs (SLO: <700.000µs 📉 -27.4%) vs baseline: 📈 +19.0%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +5.2%
✅ ospathbasename_noaspect: Time: ✅ 430.158µs (SLO: <700.000µs 📉 -38.5%) vs baseline: -1.2%; Memory: ✅ 42.684MB (SLO: <46.000MB -7.2%) vs baseline: +4.7%
✅ ospathjoin_aspect: Time: ✅ 623.648µs (SLO: <700.000µs 📉 -10.9%) vs baseline: -0.4%; Memory: ✅ 42.782MB (SLO: <46.000MB -7.0%) vs baseline: +4.7%
✅ ospathjoin_noaspect: Time: ✅ 633.166µs (SLO: <700.000µs -9.5%) vs baseline: +0.5%; Memory: ✅ 42.703MB (SLO: <46.000MB -7.2%) vs baseline: +5.0%
✅ ospathnormcase_aspect: Time: ✅ 349.352µs (SLO: <700.000µs 📉 -50.1%) vs baseline: -0.4%; Memory: ✅ 42.959MB (SLO: <46.000MB -6.6%) vs baseline: +5.3%
✅ ospathnormcase_noaspect: Time: ✅ 357.890µs (SLO: <700.000µs 📉 -48.9%) vs baseline: -0.1%; Memory: ✅ 42.605MB (SLO: <46.000MB -7.4%) vs baseline: +3.8%
✅ ospathsplit_aspect: Time: ✅ 490.425µs (SLO: <700.000µs 📉 -29.9%) vs baseline: -0.8%; Memory: ✅ 42.605MB (SLO: <46.000MB -7.4%) vs baseline: +4.3%
✅ ospathsplit_noaspect: Time: ✅ 498.672µs (SLO: <700.000µs 📉 -28.8%) vs baseline: -0.3%; Memory: ✅ 42.566MB (SLO: <46.000MB -7.5%) vs baseline: +4.3%
✅ ospathsplitdrive_aspect: Time: ✅ 377.093µs (SLO: <700.000µs 📉 -46.1%) vs baseline: +0.6%; Memory: ✅ 42.605MB (SLO: <46.000MB -7.4%) vs baseline: +4.6%
✅ ospathsplitdrive_noaspect: Time: ✅ 72.667µs (SLO: <700.000µs 📉 -89.6%) vs baseline: ~same; Memory: ✅ 42.605MB (SLO: <46.000MB -7.4%) vs baseline: +4.4%
✅ ospathsplitext_aspect: Time: ✅ 456.817µs (SLO: <700.000µs 📉 -34.7%) vs baseline: -0.6%; Memory: ✅ 42.900MB (SLO: <46.000MB -6.7%) vs baseline: +4.9%
✅ ospathsplitext_noaspect: Time: ✅ 465.239µs (SLO: <700.000µs 📉 -33.5%) vs baseline: -0.2%; Memory: ✅ 42.684MB (SLO: <46.000MB -7.2%) vs baseline: +4.3%
/merge
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts, but it could also be blocked by other repository rules or settings.
alexandre.choura@datadoghq.com unqueued this merge request
/remove
Description
The presence of nvidia-smi only means that the NVIDIA driver is installed, not that the NVIDIA Container Toolkit is configured for Docker. Relying on it alone caused "could not select device driver nvidia" errors on Linux machines that have GPU drivers but not nvidia-container-toolkit. The script now checks that Docker has the nvidia runtime before enabling the GPU compose configuration.
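For reviewers who want to reproduce the check by hand, here is a minimal Python sketch of the idea. It is not the script changed in this PR. It assumes that a runtime named "nvidia" in the output of `docker info --format '{{json .Runtimes}}'` is a reasonable signal that the NVIDIA Container Toolkit is configured; the function names `parse_runtimes`, `docker_has_nvidia_runtime`, and `should_enable_gpu_compose` are hypothetical.

```python
import json
import shutil
import subprocess


def parse_runtimes(raw: str) -> set[str]:
    """Return the runtime names found in the JSON printed by
    `docker info --format '{{json .Runtimes}}'` (empty set if unparsable)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return set()
    return set(data) if isinstance(data, dict) else set()


def docker_has_nvidia_runtime() -> bool:
    """Assumption: a runtime registered as "nvidia" means the toolkit is set up."""
    if shutil.which("docker") is None:
        return False
    try:
        proc = subprocess.run(
            ["docker", "info", "--format", "{{json .Runtimes}}"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    if proc.returncode != 0:
        return False
    return "nvidia" in parse_runtimes(proc.stdout)


def should_enable_gpu_compose() -> bool:
    """Hypothetical gate: require both the driver tool and the Docker runtime."""
    driver_present = shutil.which("nvidia-smi") is not None
    return driver_present and docker_has_nvidia_runtime()
```

Requiring both conditions means a machine with the driver but without the toolkit falls back to the non-GPU setup instead of failing when the containers start.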
Testing
The vLLM tests run and pass in CI (after re-enabling them 😛), and Santiago confirmed locally that the error above no longer appears.
Risks
Additional Notes