Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition #9

Open
illiliti opened this issue Jul 14, 2022 · 18 comments
Open

Race condition #9

illiliti opened this issue Jul 14, 2022 · 18 comments

Comments

@illiliti
Copy link

illiliti commented Jul 14, 2022

Due to lack of synchronization, a race condition may occur when e.g other programs/services need rw access to /dev/input/*. Since mdevd doesn't provide a way to request its state of processing coldplug uevents, those programs/services have no synchronization with mdevd which may lead to breakage due to bad permissions in /dev/input/*.

This is tricky issue to fix i think. Udev has udevadm settle which blocks until all events are processed. Mdev has mdev -s which set proper permissions in /dev/* without need for daemon. What mdevd should do? Looking forward to your ideas!

@skarnet
Copy link
Owner

skarnet commented Jul 14, 2022

Well, that's in part what the -O option - that you asked me for - does: the uevent only gets rebroadcast when mdevd is done handling it. So with -O 4, if you receive a /dev/input uevent on netlink group 4, it means that anything related to /dev/input in /etc/mdev.conf has already been processed. Is there another point of synchronization that you need? Or is it only a question of interfaces?

@illiliti
Copy link
Author

-O fixes race condition for hotplugged devices and not really related to this topic.

Consider the following example:

#!/bin/sh
# this is /sbin/init

# some initialization is going on here

mdevd & # this doesn't block
mdevd-coldplug # trigger uevents for cold-plugged devices
# synchronization is needed here
sway # this will fail because mdevd doesn't have enough time to handle uevents and set permissions for devnodes

Sway(well libinput) uses /dev/input to initialize cold-plugged devices. If it encounter devnode with bad permissions(because of race condition), it will exit(1). In order to fix this, some kind of synchronization(like udevadm settle) is needed between mdevd-coldplug and sway to ensure that all uevents are processed before running sway.

@skarnet
Copy link
Owner

skarnet commented Jul 14, 2022

Technically, sway doesn't need all uevents to be processed; it just needs the /dev/input processing to be done. So you could get by with -O. 😛

But yes, I understand that what you want is an equivalent to udevadm settle, for simplicity's sake. It is pretty difficult to implement, but would it work for you if mdevd-coldplug had an option that made it synchronous, i.e. it would only exit when the last coldplug uevent has been processed? I may have an idea for this.

@illiliti
Copy link
Author

Technically, sway doesn't need all uevents to be processed; it just needs the /dev/input processing to be done. So you could get by with -O. stuck_out_tongue

Sway is just an example. There are also potential problems with kernel modules and modprobe due to this race condition.

would it work for you if mdevd-coldplug had an option that made it synchronous, i.e. it would only exit when the last coldplug uevent has been processed? I may have an idea for this.

Maybe. Depends on how you want to implement it.

@skarnet
Copy link
Owner

skarnet commented Jul 14, 2022

The idea would be to make mdevd-coldplug listen to the netlink group mdevd is rebroadcasting too, and only exit when it gets an uevent for the last device it triggered.

@illiliti
Copy link
Author

This is interesting. I wonder how you intend to determine that last device.

@illiliti
Copy link
Author

I wonder how you intend to determine that last device.

I found something interesting: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-uevent

Turns out you can pass your own variables to uevent. This could be leveraged to implement detection of the last device.

@skarnet
Copy link
Owner

skarnet commented Jul 16, 2022

Oh, I don't think you need this. You just need to record the DEVPATH of the last device (which can be trivially achieved by keeping the DEVPATH of every device in a variable that gets overwritten every loop iteration, and the last value when the loop exits is the correct one).

@illiliti
Copy link
Author

Good idea

@skarnet
Copy link
Owner

skarnet commented Aug 3, 2022

@illiliti Can you test with the latest git head? mdevd-coldplug -O4 should block until the last event has been processed, if you're running mdevd -O4 as well.

@illiliti
Copy link
Author

illiliti commented Aug 4, 2022

I got segfault at

if (!strcmp(x, mdev)) break ;

gdb says x is 0x0

@skarnet
Copy link
Owner

skarnet commented Aug 4, 2022

That's worrying: is there a path where the MDEV variable is not set by mdevd? Can I see your mdev.conf file? Thanks for the report, I'll have a look.

@illiliti
Copy link
Author

illiliti commented Aug 4, 2022

is there a path where the MDEV variable is not set by mdevd?

/devices/system/memory/memory32. That's where it segfaults. It has DEVPATH but not MDEV it seems

Can I see your mdev.conf file?

https://github.com/kiss-community/repo/blob/master/extra/mdevd/files/mdevd.conf

Plus

$MODALIAS=.* root:root 660 @modprobe -q "$MODALIAS"

@skarnet
Copy link
Owner

skarnet commented Aug 4, 2022

Yeah that makes sense. The MDEV variable is actually only supposed to be exported in spawned helper commands, it doesn't belong to the uevent itself, and the fact that it was visible in the rebroadcasted events was a bug. My own mdev.conf was hiding the issue.

I removed MDEV from the rebroadcasted events and made mdevd-coldplug test against the last part of DEVPATH instead. Can you please test this version?

@illiliti
Copy link
Author

illiliti commented Aug 4, 2022

I don't see segfault anymore, but race condition is still present.

@skarnet
Copy link
Owner

skarnet commented Aug 4, 2022

That doesn't sound possible, or points to a more fundamental design issue.

  • mdevd -O4 rebroadcasts uevents to nlgroup 4 after it has finished processing them.
  • mdevd-coldplug -O4 triggers everything in /sys and waits until the last triggered uevent has been rebroadcast to nlgroup 4.
    You can check with strace that it's exactly what it does.

If you're still seeing the race condition, then either I'm missing something fundamental about how all this is working, or there's something else in your workflow that makes the uevent appear ready earlier than it should.

@skarnet
Copy link
Owner

skarnet commented Aug 4, 2022

@illiliti What steps could I take to reproduce the problem without firing up a graphical interface?

@illiliti
Copy link
Author

illiliti commented Aug 4, 2022

I guess you have to modify your init. That's what i'm doing

Anyway, i fixed race condition by running mdevd-coldplug twice. I have no idea whether it's proper fix though. Maybe running it twice gives mdevd more time to handles uevents, hence no race. I don't know

I'll investigate this a bit more, but if it is proper fix, then linux hotplugging/coldplugging is totally fucked. I don't even want to look how udev handles this cursed stuff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants