Add better meta-layer approach #21
Comments
It's good to have a spike for this, though I'd like to try to collaborate on a design doc for it.
Shouldn't we just follow https://uapi-group.org/specifications/specs/unified_kernel_image/#locations-for-distribution-built-ukis-installed-by-package-managers ?
The issue with just putting it in the filesystem is the circular hash dependency thing. It needs to be out of band, either in an artifact or a hidden layer. That's the thing that needs to be identified. In the hidden layer, I was actually imagining something like following the bootloader spec. I started to lean against the idea of hardcoding the UKI and having a verity digest for that one file. Following BLS actually gives us the ability to also do this without a UKI, which seems like something that someone might like to do.
In what I'm thinking here, we do:
When we boot, it'd be into the rootfs without the UKI, but the UKI would be in that regular place when we're run as a container image. What I don't think we've talked about, though, is whether and how we want to try to create a chain that allows us to establish trust in the manifest/config starting from the rootfs. I think in the longer term we do need to do that in order to have things like
Although I don't have anything against this system (where we effectively end up with two fs-verity digests: one for the "base" image and one for the chained image), it would substantially complicate the deployment in practice. Here's why:

We download a container image that we want to boot into. That means that we need to create a composefs for that container image. This container image had the kernel inside of it, so the composefs that we generate will also have the kernel inside of it. That means that it's going to have the "wrong" fs-verity digest compared to the one that we actually want to mount.

So how do we resolve that? We could say that we also generate a secondary composefs with some files masked out, which is the one that we use strictly for booting? Perhaps anything called Or do we refer to the original base image from our with-kernel image and that's the image that we use to build the composefs from?

This gigantic mess is kinda impenetrable... and it's sort of what sent us down this "ephemeral signing key" path in the first place... Something needs to give, and I think the thing needs to be that the UKI doesn't appear as part of the filesystem of the container. The "hidden metadata layer" approach is a massively gigantic hack but it gets the job done. I'd like it if some day we could have a way of attaching the UKI as an OCI artifact instead, but it seems like the tooling is lagging a bit there...

As for linking the composefs back to the originating container image, this seems like it would be very difficult to do in a way that didn't introduce cyclic dependencies again. As part of the main thrust of what we're doing here, we want to sign the container image, which includes the fs-verity digest of the complete composefs. If that composefs contains a reference back to the container, we're in trouble again (unless we specify that the digest gets excluded in that computation).
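To make the cyclic dependency concrete, here is a minimal toy sketch. It does not use real fs-verity or composefs: `tree_digest` is a stand-in that hashes a dict of paths to contents, and the `/boot/uki.efi` path and the `make_uki` helper are hypothetical names chosen only for illustration. The point is just that a UKI which pins the digest of the tree cannot itself live inside that tree without changing the digest.

```python
import hashlib


def tree_digest(files: dict[str, bytes]) -> str:
    """Toy stand-in for a composefs fs-verity digest (NOT the real algorithm):
    SHA-256 over the sorted (path, content-hash) pairs."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


def make_uki(root_digest: str) -> bytes:
    """Hypothetical toy UKI: a kernel blob whose cmdline pins the rootfs digest."""
    return b"KERNEL;cmdline=composefs=" + root_digest.encode()


# Base image content, without any kernel.
base = {"/usr/bin/init": b"init", "/etc/os-release": b"ID=example"}
base_digest = tree_digest(base)

# Put the UKI (which references base_digest) into the filesystem itself.
with_uki = dict(base)
with_uki["/boot/uki.efi"] = make_uki(base_digest)  # hypothetical path
digest_with_uki = tree_digest(with_uki)

# The tree that now contains the UKI no longer matches the digest the UKI pins.
cycle = digest_with_uki != base_digest
print("digest changed after adding UKI:", cycle)
```

Rebuilding the UKI against `digest_with_uki` would change the file contents and therefore the digest again, which is why the UKI has to travel out of band relative to the hashed tree.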
Another idea that I had to side-step that issue is to include it in the kernel commandline or other PE section such that it's not part of the filesystem, but this doesn't help either, since (although it doesn't impact the fs-verity digest of the composefs) the entire content of the kernel is still (indirectly) hashed into the container image, affecting its ID. So ya, maybe we invent a hash over a container config where the

If we imagine a future where the UKI is an OCI artifact instead of part of the container image then this could resolve the conflict. Artifacts refer to containers, not the other way around, and we can only find them via the referrer API. I find that sort of unsettling (for example, we could theoretically have multiple kernels for a given container image, depending on what the repository feels like serving us that day). But: the artifact for the UKI would be the thing that needs to be signed... and it also resolves the cyclic dependency, so we could in that case refer to the container ID from the commandline of the kernel. This all feels kinda "far away" though.
Yes for sure, we'd want a special annotation on the final kernel layer, so that tooling knows to extract and also make a composefs for everything before that layer as well. I don't see this as weird at all - remember in a general image derivation case we'll often have around a composefs for layers 0..N for a base image and another for layers 0..(N+1) for a final derived image.
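A rough sketch of what such tooling might do with the annotation, under stated assumptions: the annotation key `org.example.kernel-layer` is made up for illustration (no real key is named in this thread), and the manifest is modeled as a plain list of layer descriptors rather than parsed from a registry.

```python
# Hypothetical annotation key; the thread does not specify a real one.
KERNEL_LAYER_ANNOTATION = "org.example.kernel-layer"


def composefs_layer_sets(layers: list[dict]) -> list[list[str]]:
    """Return the lists of layer digests that a composefs should be built for.

    The full image (layers 0..N+1) is always included. If the final layer is
    marked as the kernel layer, the prefix without it (layers 0..N) is
    included as well, so tooling also has a composefs for the kernel-less tree.
    """
    digests = [layer["digest"] for layer in layers]
    sets = [digests]
    if layers:
        annotations = layers[-1].get("annotations", {})
        if annotations.get(KERNEL_LAYER_ANNOTATION) == "true":
            sets.insert(0, digests[:-1])
    return sets


manifest_layers = [
    {"digest": "sha256:aaa"},
    {"digest": "sha256:bbb"},
    {"digest": "sha256:ccc", "annotations": {KERNEL_LAYER_ANNOTATION: "true"}},
]
layer_sets = composefs_layer_sets(manifest_layers)
print(layer_sets)
```

This mirrors the general derivation case mentioned above: one composefs for the base layers and another for the derived image that adds one layer on top.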
Can you write out a bit more what this is? There aren't many comments in the code and it's not immediately obvious to me how it works in the code right now.
Yes. This absolutely needs to be better documented, but here we go:

We need a way to include a UKI in the filesystem of the container in such a way that the composefs fs-verity digest doesn't change. We can't attach large blobs of data directly to the config of the container. OCI artifacts aren't fully supported in podman yet (and even if they were, it's not clear if they're the right fit because they're not actually part of the image, but rather associated with it on the container repository).

So here's what we do. Take the base image (which we computed the composefs fs-verity digest against) and add two layers to it:
Assuming the incoming container didn't already have a Once we extract the UKI, it will contain a composefs= cmdline parameter that references the composefs of the final container image (which, as mentioned above, is equal to the composefs of the base image). This is how the cyclic hash dependency gets resolved.
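The property claimed above (final image filesystem equals base image filesystem) can be sketched with a toy layer model. This assumes the two layers are "add the UKI" followed by "white it out", which is how the approach is referred to later in this thread as "add then whiteout". The `.wh.` prefix is the OCI image spec's whiteout convention; the `/boot/uki.efi` path is hypothetical, and directories are not modeled, only files.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."  # OCI image layer whiteout file prefix


def apply_layers(layers: list[dict[str, bytes]]) -> dict[str, bytes]:
    """Flatten toy layers (path -> content) into a final file map."""
    fs: dict[str, bytes] = {}
    for layer in layers:
        # Whiteouts hide entries from lower layers, so process them first.
        for path in layer:
            directory, name = posixpath.split(path)
            if name.startswith(WHITEOUT_PREFIX):
                target = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                hidden = [p for p in fs if p == target or p.startswith(target + "/")]
                for victim in hidden:
                    del fs[victim]
        # Then add this layer's regular files.
        for path, content in layer.items():
            if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                fs[path] = content
    return fs


base_layers = [{"/usr/bin/init": b"init", "/etc/os-release": b"ID=example"}]
uki_layer = {"/boot/uki.efi": b"UKI bytes"}   # hypothetical path and content
whiteout_layer = {"/boot/.wh.uki.efi": b""}   # removes the UKI again

all_layers = base_layers + [uki_layer, whiteout_layer]
base_fs = apply_layers(base_layers)
final_fs = apply_layers(all_layers)

# The UKI is recoverable from the second-highest layer, yet absent from the
# flattened filesystem, so a digest over final_fs matches one over base_fs.
meta_layer = all_layers[-2]
print(final_fs == base_fs, "/boot/uki.efi" in meta_layer)
```

Because the flattened tree is unchanged, the composefs= value baked into the UKI can be computed from the base image before the two extra layers exist.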
OK, right, I think I remember you describing this "add then whiteout". I need to think harder about the security properties of this versus my proposal of "just add a layer". First, do you agree that "bootable verified OCI" should just be a special case of "verified OCI" for apps or other use cases? If so, then in the "add then whiteout" approach we would have a situation where the config digest == UKI digest, right? But in the end what's the advantage of having "config digest == UKI digest"? We can equally well access the UKI in a "just add a layer" approach. Here config digest != UKI digest, but I don't see a problem with that; from the system again given a verified config, finding the UKI is just traversing into its standard place in So they seem equivalent from a security PoV. From an elegance/understandable PoV, wouldn't you agree it's just nicer to have e.g.
Instead of returning the second-highest layer as the meta-layer, we can:
install-kernel command or so that defaults to unpacking in /boot. We could use that during the OS image build, but it would also allow for easier "upgrades" on running systems.