Implement systemd-boot boot assessment #2864
Comments
I can help test this when someone is ready for testing. I was also thinking about how the system moves from a failed active AND passive into recovery or reset. Right now recovery requires human intervention and doesn't load any sysext options, so it has to be pretty bare bones, as we are keeping UKI images small. I was thinking about building an auto-update script for recovery that runs and tries to fix active/passive by running an upgrade and/or checking an HTTPS website for instructions. It would then not automatically update the systemd-boot count for recovery, and instead let a successful active/passive boot reset the count for recovery. This would make sure that if recovery fails to recover the system after X attempts, a reset is triggered, which hopefully can do a better job of setting everything right and blowing away filesystems to clean things up. |
Planning decision: Let's implement the default fallback mechanism of systemd first and then see if we can implement the auto-reset feature using stages and such (extract to a different ticket when the first part is done). Being able to auto-reset a system that doesn't boot makes sense, especially in cases like:
|
with the given patch it seems to work BUT
2 possible outcomes:
thoughts @kairos-io/maintainers |
Basically, this is the expected workflow of the boot assessment, for reference: https://systemd.io/AUTOMATIC_BOOT_ASSESSMENT/ Important part below
|
I don't think this works for us, as we need to wait for the We could also have a manual service that runs after systemd multi-user.target
|
Mmh, complex but doable. The only challenge I see there is firing the systemd services exactly in that timeframe; I'm not sure that is possible other than by calling systemd-bless-boot inside immucore
That looks like the sanest solution at this point. However, my only concern here is whether systemd-bless-boot will get more business logic from systemd that we might miss. Wouldn't it at this point be equivalent to calling systemd-bless-boot from immucore directly? |
Yeah, after checking more deeply, this won't work, as the bless happens once the system is fully up, so in userspace once systemctl reports everything as running. That is out of immucore's control, unfortunately
It seems like we may be able to do it ourselves by just calling the binary, so mimicking the bless service but with extra steps. Maybe even with a simple override to run pre and post steps for the mounts, so we don't need to reimplement the whole thing |
That was exactly what I was thinking. We need to modify the path for Maybe changing systemd-bless-boot.service with an override file to have something like:
|
I actually tested this with overrides for mounting and unmounting the partition and it worked as expected. I think it gets the path automatically, either by identifying the partition type or from the systemd-boot efivars, but it does actually work as expected |
With this override the boot-bless service works
Notice that we also need to override another service, boot-random-seed, as it is brought in automatically and needs write access to efi
|
There is still an issue, but we can work around it with this
So on our The main problem is that when bless-boot marks a config as good after booting, it renames it to remove the boot assessment part, as it's marked as good, so So to fix that, we can use the service itself to remove any mentions of the boot assessment part in the I tested this with an I think we can work with this. I will test it further, but it seems to work as expected. Moving pieces needed to fully implement this:
|
Mostly done; only the agent PR is still waiting to be merged. Then we can test it once it's in the framework and such, but in local testing it seems to work as expected |
systemd-boot has a way to perform boot assessment and fall back to other entries if booting fails. It is described in detail here and here. It's not very complicated: it only requires us to name the conf/efi files in a certain way and to order the entries properly (so that the right one is picked as a fallback).
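To make the naming scheme concrete, here is a minimal sketch of how the boot counters embedded in the file names evolve, based on the upstream Automatic Boot Assessment document: an entry named `<stem>+<left>[-<done>].conf` has `left` tries remaining and `done` failed tries so far, each boot attempt moves one try from `left` to `done`, and a successful bless drops the counter entirely. The entry name `active` and the helper names (`parse`, `record_attempt`, `bless`, `is_bad`) are assumptions made for illustration; this models the renaming only and is not how systemd-boot or Kairos implements it.

```python
import os
import re
from typing import NamedTuple, Optional

# Matches "<stem>+<left>[-<done>]<ext>", e.g. "active+3.conf" or "active+2-1.efi".
_COUNTER_RE = re.compile(
    r"^(?P<stem>.+)\+(?P<left>\d+)(?:-(?P<done>\d+))?(?P<ext>\.(?:conf|efi))$"
)


class Entry(NamedTuple):
    stem: str
    ext: str
    left: Optional[int]  # None means the entry carries no boot counter
    done: int


def parse(name: str) -> Entry:
    """Split a boot entry file name into stem, extension and counters."""
    m = _COUNTER_RE.match(name)
    if m is None:
        stem, ext = os.path.splitext(name)
        return Entry(stem, ext, None, 0)
    return Entry(m["stem"], m["ext"], int(m["left"]), int(m["done"] or 0))


def format_name(entry: Entry) -> str:
    """Rebuild the file name; the "-<done>" part is omitted while done is 0."""
    if entry.left is None:
        return entry.stem + entry.ext
    suffix = f"+{entry.left}"
    if entry.done:
        suffix += f"-{entry.done}"
    return entry.stem + suffix + entry.ext


def record_attempt(name: str) -> str:
    """Model the rename done at boot: one try moves from 'left' to 'done'."""
    entry = parse(name)
    if entry.left is None or entry.left == 0:
        return name  # no counter, or no tries left: nothing to decrement
    return format_name(entry._replace(left=entry.left - 1, done=entry.done + 1))


def bless(name: str) -> str:
    """Model marking an entry good: the counter is removed from the name."""
    return format_name(parse(name)._replace(left=None, done=0))


def is_bad(name: str) -> bool:
    """An entry with zero tries left is considered bad and sorted last."""
    return parse(name).left == 0


if __name__ == "__main__":
    name = "active+3.conf"  # hypothetical entry name
    for _ in range(3):
        name = record_attempt(name)
        print(name)
    print("bad:", is_bad(name))
```

Note that blessing changes the file name, so anything that refers to the entry by its counted name has to cope with the rename.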
Note:
Originally investigated while documenting how Kairos does boot assessment,