Add Graceful System-Suspend Support #35
Open: arthurt wants to merge 8 commits into heftig:master from arthurt:suspend-resume
To avoid confusion, differentiate between the scheduling priority used by non-realtime scheduling to calculate dynamic priority (aka nice value), and the static scheduling priority used for realtime schedulers. Renaming argument names in a DBus interface is fully compatible. Also makes rtkit.c and rtkit-daemon.c agree.
Implement the suspend and resume functionality to temporarily demote and restore managed thread priorities. Priorities of managed threads are remembered when granted. On suspend, all managed threads are demoted, and the canary is stopped. While suspended, new promotion requests are rejected. Managed threads are still garbage collected, but the lack of a promoted priority is ignored. Reset removes all managed threads, leaving no threads to re-promote on resume. On resume, the canary is restarted, and all managed threads are re-promoted, heeding but not enforcing the user burst limit. Suspend and Resume are only available to admin callers, preventing abuse. Notwithstanding, if a malicious user were able to call suspend and resume at will, they still could not circumvent the count or burst limiting. No new thread promotions can be created while suspended. Further, while the user burst limit is not enforced on resume, it is still updated, and the burst time window is restarted on resume.
Handling org.freedesktop.DBus.Properties and org.freedesktop.DBus.Introspectable messages causes a debug log message about the number of monitored threads, despite these interfaces not being able to add threads. As these interfaces may be called frequently by other bus users, skip printing the debug message in these cases.
Add support for logind system suspend delay and signalling using DBus. This adds a race-free way to suspend priorities when the system is going to sleep, and resume them when the system wakes up. The main reason for suspending priorities is that the canary can then be stopped, as a system suspend-resume cycle frequently causes a false positive of the canary. See https://www.freedesktop.org/wiki/Software/systemd/inhibit/
I wonder if rtkit should just be forked into an organization somewhere on Freedesktop.org's GitLab, just so we can keep on using it.
This PR addresses the issue of canary false positives caused by system suspend by adding the new public methods Suspend() and Resume(), and by optionally connecting them to logind system-sleep handling.
The Bug
During a system suspend-resume (sleep) cycle, the canary thread often experiences a time jump, which causes a starvation false positive. rtkit takes action and demotes the realtime/high priority of all known threads.
Long-running realtime processes (PipeWire, PulseAudio) generally request realtime/high priority only once. If a system goes to sleep, the realtime/high priority scheduling is lost until these long-running processes are next started, typically after logout and login. As users generally suspend their machines more often than they log in, rtkit is effectively non-functional for these processes, arguably the most important users of rtkit.
Even non-long-running processes may have lifecycles which span system suspend-resume cycles, and so operate in a degraded way for users.
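To illustrate why a suspend can look like starvation, here is a minimal sketch of the kind of deadline check a canary watchdog performs. It is a simplified model, not rtkit's actual code: the function name `canary_looks_starved`, the `WATCHDOG_MS` threshold, and the use of a millisecond clock that keeps counting across sleep are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: how long the watchdog tolerates silence from
 * the canary before it assumes the canary is being starved. */
#define WATCHDOG_MS 10000

/* Returns true when the canary has been silent for longer than the
 * threshold. Timestamps are milliseconds taken from a clock that keeps
 * advancing while the system is asleep (an assumption of this model). */
bool canary_looks_starved(uint64_t last_cheep_ms, uint64_t now_ms)
{
    assert(now_ms >= last_cheep_ms);
    return (now_ms - last_cheep_ms) > WATCHDOG_MS;
}
```

In this model, a check 5 seconds after the last canary signal passes, while the first check after a 30-minute sleep reports starvation even though no realtime thread monopolised the CPU.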
See
Why
With the view that the primary bug this change seeks to address is the canary false positives, it would seem far simpler to only stop the canary on suspend and restart it on resume. However, doing so would degrade security for a controllable window. From a security perspective, one might as well just disable the canary altogether. To safely disable the canary, we need to first demote all threads.
Suspend/Resume Operation
Two new admin operations are added to rtkit.
org.freedesktop.RealtimeKit1.Suspend(), rtkitctl --suspend
org.freedesktop.RealtimeKit1.Resume(), rtkitctl --resume
These temporarily demote and restore managed thread priorities, as well as stop and start the canary.
On Suspend(), all managed threads are demoted, and the canary is stopped.
While suspended, new realtime/high priority requests are rejected. Managed thread state is still garbage collected if a thread exits, but is retained otherwise.
On Resume(), the canary is restarted, and all managed threads are re-promoted. Current user burst limit timeouts are restarted, and the re-promotion of threads counts toward burst limiting, but the burst limit is not enforced on the re-promotion.
Calling ResetKnown() or ResetAll() while suspended removes all managed threads which lack realtime/high priority, leaving no threads to re-promote later.
Calling either Suspend() or Resume() multiple times in a row is fine, but only the first call has an effect.
Security Considerations
Suspend() and Resume() are only available to admin callers, preventing abuse. Notwithstanding, if a malicious user were able to call Suspend() and Resume() at will, they still could not circumvent the count or burst limits. No new thread promotions can be created while suspended. Further, while the user burst limit is not enforced on resume, it is still updated, and the burst timeout is restarted.
It may be safe to allow new realtime/high priority grants made while suspended to take effect upon resume, but this is an unlikely case, so it is easier to just refuse.
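The suspend/resume semantics and the burst-limit behaviour described above can be modelled roughly as follows. This is a sketch, not the code in this PR: the `struct kit` type, all function names, and `MAX_THREADS` are hypothetical. It tracks a single user only, treats priority 0 as "demoted", and assumes that restarting the burst window resets the counter before re-promotions are counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 8            /* hypothetical capacity for this sketch */

struct managed_thread {
    int tid;
    int granted_prio;            /* remembered when the grant was made */
    int current_prio;            /* 0 means demoted in this model */
};

struct kit {
    struct managed_thread threads[MAX_THREADS];
    size_t n_threads;
    bool suspended;
    bool canary_running;
    unsigned burst_count;        /* grants counted in the current window */
    unsigned burst_max;          /* limit enforced on new grants only */
};

void kit_init(struct kit *k, unsigned burst_max)
{
    assert(k != NULL);
    k->n_threads = 0;
    k->suspended = false;
    k->canary_running = true;
    k->burst_count = 0;
    k->burst_max = burst_max;
}

/* New grants are refused while suspended or once a limit is reached. */
bool kit_grant(struct kit *k, int tid, int prio)
{
    if (k->suspended || k->n_threads == MAX_THREADS ||
        k->burst_count >= k->burst_max)
        return false;
    k->threads[k->n_threads++] = (struct managed_thread){ tid, prio, prio };
    k->burst_count++;
    return true;
}

void kit_suspend(struct kit *k)
{
    if (k->suspended)
        return;                  /* repeated calls have no effect */
    for (size_t i = 0; i < k->n_threads; i++)
        k->threads[i].current_prio = 0;   /* demote first ... */
    k->canary_running = false;            /* ... then stop the canary */
    k->suspended = true;
}

void kit_resume(struct kit *k)
{
    if (!k->suspended)
        return;                  /* repeated calls have no effect */
    k->canary_running = true;
    k->burst_count = 0;          /* burst window restarts (assumption) */
    for (size_t i = 0; i < k->n_threads; i++) {
        k->threads[i].current_prio = k->threads[i].granted_prio;
        k->burst_count++;        /* counted, but the limit is not checked */
    }
    k->suspended = false;
}
```

Because re-promotions are counted, a user who already sits at the burst limit still cannot obtain extra grants immediately after a resume, which is the property the security argument relies on.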
logind Integration
This change also adds an optional runtime integration with logind's inhibitor locks for handling system suspend.
If the logind DBus service is running and accessible, rtkit will register a "delay sleep inhibitor" and listen for signals from logind about when the system is going to sleep or has just woken up. Using the sleep inhibitor, logind will wait for rtkit to perform its Suspend() operation before letting the system suspend. On system resume, logind will again notify rtkit, which will perform Resume() and register a new inhibitor.
See https://www.freedesktop.org/wiki/Software/systemd/inhibit/
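The handshake above follows logind's delay-inhibitor pattern: take a delay lock with `Inhibit("sleep", who, why, "delay")`, react to the `PrepareForSleep` signal, and release the lock once the work is done. The sketch below models only that ordering and makes no real DBus calls. The `sleep_ctx` type and every function name are hypothetical, and the inhibitor file descriptor is simulated with a counter rather than obtained from logind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sleep_ctx {
    int inhibit_fd;              /* -1 when no delay lock is held */
    int next_fake_fd;            /* stand-in for descriptors from logind */
    bool priorities_suspended;
};

/* Stand-in for calling Inhibit("sleep", who, why, "delay") on logind. */
void take_delay_inhibitor(struct sleep_ctx *c)
{
    assert(c->inhibit_fd == -1);
    c->inhibit_fd = c->next_fake_fd++;
}

/* Stand-in for close(fd); releasing the lock lets logind proceed. */
void release_delay_inhibitor(struct sleep_ctx *c)
{
    c->inhibit_fd = -1;
}

void sleep_ctx_init(struct sleep_ctx *c)
{
    assert(c != NULL);
    c->inhibit_fd = -1;
    c->next_fake_fd = 3;
    c->priorities_suspended = false;
    take_delay_inhibitor(c);     /* hold a lock before any sleep request */
}

/* Handler for logind's PrepareForSleep(boolean) signal:
 * true before the system sleeps, false after it wakes up. */
void on_prepare_for_sleep(struct sleep_ctx *c, bool going_to_sleep)
{
    assert(c != NULL);
    if (going_to_sleep) {
        c->priorities_suspended = true;      /* Suspend() runs first ... */
        if (c->inhibit_fd != -1)
            release_delay_inhibitor(c);      /* ... then sleep may continue */
    } else {
        c->priorities_suspended = false;     /* Resume() */
        if (c->inhibit_fd == -1)
            take_delay_inhibitor(c);         /* re-arm for the next sleep */
    }
}
```

The ordering is what makes the scheme race-free: the lock is released only after priorities are suspended, and a fresh lock is taken on wake-up so the next sleep request is delayed as well.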
Alternate Integrations
No alternate automatic system-suspend integration is provided, but rtkitctl --suspend and rtkitctl --resume should make this task easy.
Other Changes
Rename priority (dynamic) to nice_level inside of process_set_high_priority(). This helps differentiate it from priority (static) as used by process_set_realtime(). Also, it is called nice_level everywhere else in the code.
Reduce log spam by not printing a message for every handled DBus message, as that includes DBus introspection and properties-related messages. Some programs (Firefox in my case) get rtkit properties more frequently than I would think necessary.
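The log-spam reduction amounts to a check on the message's interface before the debug line is printed. A minimal sketch follows, assuming the interface name is available as a C string; the helper name `should_log_thread_count` is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns false for the standard DBus interfaces that cannot add
 * threads, so the "monitored threads" debug line is skipped for them.
 * Callers are expected to pass a non-NULL interface name. */
bool should_log_thread_count(const char *interface)
{
    assert(interface != NULL);
    if (strcmp(interface, "org.freedesktop.DBus.Properties") == 0)
        return false;
    if (strcmp(interface, "org.freedesktop.DBus.Introspectable") == 0)
        return false;
    return true;
}
```

Calls on org.freedesktop.RealtimeKit1 itself still produce the debug message, since those are the calls that can change the set of managed threads.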