Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg/sd: set the DeviceAllow property before DevicePolicy #4569

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

wenjianhn
Copy link

Every unit created by runc need daemon reload since systemd v230. This breaks support for NVIDIA GPUs, see
#3708 (comment)

Add a workaround for the below systemd issue.
systemd/systemd#35710

Instead of filling the empty DeviceAllow array, a new array is created with allowed devices. Remove the comment about it, since it's misleading.

Closes #4568

Every unit created by runc need daemon reload since systemd v230.
This breaks support for NVIDIA GPUs, see
opencontainers#3708 (comment)

Add a workaround for the below systemd issue.
systemd/systemd#35710

Instead of filling the empty DeviceAllow array, a new array is created
with allowed devices. Remove the comment about it, since it's misleading.

Closes opencontainers#4568

Signed-off-by: Jian Wen <[email protected]>
@kolyshkin
Copy link
Contributor

Instead of filling the empty DeviceAllow array, a new array is created
with allowed devices. Remove the comment about it, since it's misleading.

Indeed the comment (// Empty the DeviceAllow array before filling it., added by commit b810da1) is misleading -- we do not fill this entry but add another one after it.

Yet it's kind of weird how the order of entries can change the systemd behavior. Summoning @cyphar who may shed some light.

@cyphar
Copy link
Member

cyphar commented Dec 23, 2024

Yeah, the need for this fix is incredibly strange -- systemd's whole declarative design should mean that the order of properties in a .unit (even a transient unit) should not matter and the fact it does seems like a systemd bug. Then again, this wouldn't be the first time we've been hit with weird bugs in transient units...

Ultimately, I don't mind taking this patch (since it's conceptually a no-op but seems to be a systemd bug we are working around) but we can't be sure that tomorrow systemd won't start having issues with this in a different way... I also am a little concerned (given that systemd seems to not apply these rules in a declarative way) that setting the DevicePolicy after DeviceAllow might change the DeviceAllow list (though that seems unlikely)...

Indeed the comment (// Empty the DeviceAllow array before filling it., added by commit b810da1) is misleading -- we do not fill this entry but add another one after it.

Maybe the comment is a bit poorly worded, but what I was trying to say is that we are clearing systemd's internal DeviceAllow array. If you just had a DeviceAllow=/dev/foo entry, this would be appended to the existing set of allowed devices for the unit (for new units this would include the default devices allowed by systemd IIRC, and on updates it would include all allow devices in previous configurations meaning you couldn't block access to a device with runc update). The only way to clear the DeviceAllow set is to set it to an empty array first.

If you feel the comment is confusing, we can change it, but it shouldn't be removed entirely -- this behaviour from systemd is quite subtle and deserves a comment.

(At least, that was my understanding of the systemd.unit documentation when I wrote this, I don't remember if not adding this code caused issues, maybe the DBus API for systemd is a pure setter API? I'm not sure. The commit description says that this is not the case, so I guess I did test it? I don't remember.)

@wenjianhn
Copy link
Author

@cyphar how about the below comment

// To reset the configured systemd DeviceAllow array set the property to an empty array first, then append to the systemd array
// See SetUnitProperties() of https://www.freedesktop.org/wiki/Software/systemd/dbus/

@cyphar
Copy link
Member

cyphar commented Dec 23, 2024

I would prefer a link to the DeviceAllow= docs that mention this behaviour, but I just looked and it seems this behaviour is not explicitly documented. This is just the general behaviour of these list-style options of systemd...

I think something like this would be clearer (even though it's a bit longer):

// When we later add DeviceAllow=/dev/foo properties, we are appending devices
// to the allow list for the unit. However, if this is an existing unit, it
// already has DeviceAllow= entries, and we need to clear them all before
// applying the new set. (We also do this for new units, mainly for safety to
// ensure we only enable the devices we expect.) To clear any existing
// DeviceAllow= rules, we have to add an empty DeviceAllow= property.
newProp("DeviceAllow", []deviceAllowEntry{}),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Every unit created by runc need daemon reload since systemd v230.
3 participants