buildah 1.38.1 build "FROM oci-archive:" fails to find content generated inside a build #5952

Open · henrywang opened this issue Jan 29, 2025 · 43 comments
Labels: jira (Issues which will be sync'd to a card at https://issues.redhat.com/projects/RUN)

@henrywang

quay.io/buildah/stable:v1.38.1 fails to build an image with FROM oci-archive:./out.ociarchive, with the following error:

[2/2] STEP 1/3: FROM oci-archive:./out.ociarchive
Error: creating build container: creating temp directory: archive file not found: "/builds/redhat/rhel/bifrost/compose-bootc/out.ociarchive"

quay.io/buildah/stable:v1.38.0 does not have this issue.

@vrothberg (Member) commented Jan 29, 2025

Detected in the Image Mode pipelines which cannot use v1.38.1 for now.

https://gitlab.com/redhat/rhel/bifrost/compose-bootc/-/merge_requests/208/diffs

@vrothberg (Member)

Another MR only bumping buildah fails as well.

@vrothberg (Member)

@henrywang do we have an easy reproducer?

@cgwalters commented Jan 29, 2025

I think this is a regression from 25a3b38

This feature is really key as a way to generate chunked/reproducible containers. We've been using it for quite some time in rpm-ostree, and we have plans in the immediate future to promote it more for custom base images (see https://gitlab.com/fedora/bootc/tracker/-/issues/32 and coreos/rpm-ostree#5221 specifically).

Way back in coreos/rpm-ostree#4688 (comment) there was a soft commitment to support this.

My understanding is that there are security implications (not mentioned in the commit message) that motivated 25a3b38.

Can we elaborate on those now that the commit is public (I don't see a corresponding PR with discussion)?

It seems to me that what we always wanted here is a way to write not to the build context, but just any internal filesystem (a tmpfs or a tempdir bind mount) whose lifetime is scoped to the entire build, right?

@mheon (Member) commented Jan 29, 2025

The relevant CVE is GHSA-5vpc-35f4-r8w6

A core part of this was the ability to write to the host filesystem from mounts, which is why it was disabled - so what you desire is a large part of a high-severity CVE. I don't see us turning this back on. Adding an option with the understanding that enabling it is explicitly insecure and exposes the host to potential escape by a malicious Dockerfile is a potential option, but that also feels very bad.

@cgwalters

@henrywang do we have an easy reproducer?

Here's a trivial example of a "build" that actually just copies from a registry into a local oci: directory via skopeo. But the point is that an "oci" directory is something that any tool running in the container build can synthesize however it wants - without e.g. being subject to injected floating-timestamp content from buildah itself, and more generally with total power over structuring the layout of the target container image.

FROM registry.access.redhat.com/ubi9/ubi:latest  as builder
RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared \
  dnf -y install skopeo && skopeo copy docker://busybox oci:/buildcontext/out.oci

FROM oci:./out.oci
# Need to reference builder here to force ordering. But since we have to run
# something anyway, we might as well cleanup after ourselves.
RUN --mount=type=bind,from=builder,src=.,target=/var/tmp \
    --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared \
      rm /buildcontext/out.oci -rf

@cgwalters

Adding an option with the understanding that enabling it is explicitly insecure and exposes the host to potential escape by a malicious Dockerfile is a potential option, but that also feels very bad.

Yes, agree adding an option isn't what we want here. Let's instead dig more into the final paragraph of my previous comment:

It seems to me that what we always wanted here is a way to write not to the build context, but just any internal filesystem (a tmpfs or a tempdir bind mount) whose lifetime is scoped to the entire build, right?

To flesh this out slightly...maybe we could use RUN --mount=type=cache for this since it's already defined to be a writable, persistent-across-run-invocations area. IOW we change to:

FROM registry.access.redhat.com/ubi9/ubi:latest  as builder
RUN --mount=type=cache,sharing=private,dst=/output \
  dnf -y install skopeo && skopeo copy docker://busybox oci:/output/out.oci

# Here I made up the syntax //<cachename>/ to reference a cache
FROM oci://output/out.oci

Note if we take this step and formalize things more it actually gets a lot cleaner as the builder can know the ordering here and we can declare that the cached OCI directory is automatically consumed (since why would you leave it around)?

@nalind (Member) commented Jan 29, 2025

Note if we take this step and formalize things more it actually gets a lot cleaner as the builder can know the ordering here and we can declare that the cached OCI directory is automatically consumed (since why would you leave it around)?

Caches are not scoped to the lifetime of a given build.

@cgwalters

Caches are not scoped to the lifetime of a given build.

Yes, I know - its lifetime is longer, which isn't needed here, but it should still work. In my proposal you'd just end up with an empty directory/image as a cache; we could actually detect that case and just discard it.

Do you have any alternative proposals?

@cgwalters

Maybe another possible fix would be to expand the lifetime of the overlayfs created from a RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared to the entire build, instead of each RUN invocation?

I guess though a downside of this might be that it's different from what Docker is doing today, which could confuse people.

But at the cost of getting more hacky, we could do that scope expansion only if we detect there's a FROM oci: in the build?

@cgwalters

I should emphasize here again: our production builds use this feature today. Having a working version of this (ideally compatible with the existing Containerfile, although that wouldn't be a requirement) as soon as possible (e.g. < 1 month) would be quite helpful.

I think in the short term we can just hold off on applying the buildah update, but that's obviously not sustainable. Among other things, a big part of the entire point of this is that we can tell people the base image build is just "podman build" today, and after updating that no longer works.

@cgwalters commented Jan 29, 2025

Maybe another possible fix would be to expand the lifetime of the overlayfs created from a RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared to the entire build, instead of each RUN invocation?

It looks like bind-propagation is a buildah-specific extension not supported by Docker; so perhaps we could also key off that as a lifetime extension hint? Though when I test things out, a build works without it - so either it was never required or something else changed in buildah.

I guess the more I think about the more I'd say:

  • Short term, detect FROM oci in the image and lifetime expand any RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared to the entire build (mounts identified by source+destination pair)
  • Medium term, try to come up with a design that's cleaner and can ideally be added to docker (and other builders) as well, for which I think I like the FROM oci://output/out.oci or FROM --mount=type=cache,target=/output oci:/output/out.oci or so.

@nalind (Member) commented Jan 29, 2025

I would have (and have previously) suggested solving this at the pipeline level.

Anyway, the reason the FROM fails is that the OCI archive is no longer being written to the actual build context directory by the previous stage, so it's literally not where the FROM instruction says it is. Unless it's a remote build being done via podman, a -v flag at the command line can be used instead of a --mount=type=bind in the Containerfile to expose the build context directory to RUN instructions.
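To sketch that -v approach (the paths here are illustrative, not taken from this thread): the caller exposes the build context to RUN instructions from the command line instead of declaring the mount in the Containerfile, e.g.

podman build -v "$PWD:/buildcontext" -t localhost/out -f Containerfile .

(On SELinux-enforcing systems a :z/:Z volume suffix or --security-opt label=disable may also be needed.)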

When we teach the --build-context command line flag about OCI layouts (and possibly archives), the value on the FROM line will be able to be replaced by the nickname given as part of the flag's argument. That'll back us off a bit further from having custom behavior in Containerfiles, a direction I've come to generally discourage because the target we're trying to be compatible with moves.
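As a purely hypothetical sketch of what that might look like once implemented (the --build-context flag itself exists today, but the OCI-layout value syntax below is invented for illustration):

podman build --build-context ocibase=oci-layout:./out.oci -t localhost/out .

with the Containerfile then saying FROM ocibase instead of spelling out a path.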

@nalind (Member) commented Jan 29, 2025

It looks like bind-propagation is a buildah-specific extension not supported by Docker

It looks like that flag is there because the set of accepted flags for buildah run --mount was modeled after podman run's --mount flag, and the logic for parsing its key-value arguments list was reused when it was later implemented for RUN in Containerfiles.

@cgwalters

I would have (and have previously) suggested solving this at the pipeline level.

Can you elaborate on that? Previously in this comment you said:

and I'm not sure we want to sign up for that when the standard advice for jobs that can't be done with a Dockerfile is to use buildah's other commands.

In what we're doing here we aren't using buildah in any way, because we want fine-grained control over the bit-for-bit output of the OCI archive.

But your point more generally here I would phrase as "don't try to do it via Containerfile" - which definitely makes sense as a starting point, except I'd just reiterate that going from "podman build" to "run this script/Makefile" is a large leap.

Actually, today this trick doesn't work on e.g. macOS (or more generally with podman-machine); I didn't dig in, but I think it has to do with us always copying the build context with podman-remote. But in theory it could work in podman-remote/podman-machine cases, which is quite important because there's lots of infrastructure built on top of "podman build". If we have a custom script/pipeline, then especially in podman-remote cases we'd have to do some careful juggling to make sure our build works there.

Hmm...I guess in theory we could implement our build process as "spawn a container image with write access to containers-storage:" (or if one wants to isolate it more, "spawn a container image with write access to an empty tempdir, which at the end of the build should contain an oci directory with one image").

But we just went through a large effort to switch our Konflux builds over to the "stock" buildah task they expose. Reworking everywhere we do a build, from the documented local developer entrypoint to Konflux, would not be a small thing for us.

It's big enough that if I had to I'd rather spend that engineering time on implementing a more sustainable way to do this in buildah myself.

@cgwalters

When we teach the --build-context command line flag about OCI layouts (and possibly archives), the value on the FROM line will be able to be replaced by the nickname given as part of the flag's argument.

Hmm...but here we're generating an entire OCI layout inside a build stage, we can't provide it as a build context at the start of a build.

@nalind (Member) commented Jan 29, 2025

I would have (and have previously) suggested solving this at the pipeline level.

Can you elaborate on that? Previously in this comment you said:

and I'm not sure we want to sign up for that when the standard advice for jobs that can't be done with a Dockerfile is to use buildah's other commands.

In what we're doing here we aren't using buildah in any way, because we want fine-grained control over the bit-for-bit output of the OCI archive.

I'm pretty sure that at that point I was referring to "FROM oci-archive:", and if you're feeling charitable, depending on it.

But your point more generally here I would phrase as "don't try to do it via Containerfile" - which definitely makes sense as a starting point, except I'd just reiterate that going from "podman build" to "run this script/Makefile" is a large leap.

Agreed, it's not as simple a story to tell.

Actually, today this trick doesn't work on e.g. macOS (or more generally with podman-machine); I didn't dig in, but I think it has to do with us always copying the build context with podman-remote. But in theory it could work in podman-remote/podman-machine cases, which is quite important because there's lots of infrastructure built on top of "podman build". If we have a custom script/pipeline, then especially in podman-remote cases we'd have to do some careful juggling to make sure our build works there.

Hmm...I guess in theory we could implement our build process as "spawn a container image with write access to containers-storage:" (or if one wants to isolate it more, "spawn a container image with write access to an empty tempdir, which at the end of the build should contain an oci directory with one image").

I have personally tripped over the overlay driver's mount_program option having to be specified, or not specified, more than I'd like to have, to the point that I tend to want to avoid sharing the storage space. That's before you start worrying about different storage drivers.

The -t flag is supposed to accept any valid containers-transports(5) value, if that's of help.
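For example, going by that statement (a sketch, not verified here), the final image could be written straight to an archive instead of into containers-storage:

buildah build -t oci-archive:./out.ociarchive -f Containerfile .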

@cgwalters commented Jan 29, 2025

Can you indicate your level of support for the "short+medium term options" listed in this comment, ranging from say "yes, I'd implement" to "would take a patch" to "no"? Most especially for the "short term" one of lifetime-extending only if we detect FROM oci:

Actually let me just repeat it here, and expand/elaborate:

Option lifetime-extend:

  • Detect FROM oci: in the container build and lifetime expand the overlayfs created by any RUN --mount=type=bind,rw=true,src=<local path> to the entire build
  • Though now that I think about this more, I'm pretty sure the FROM oci: part was actually relying on picking up the OCI archive from the host/default mount namespace by default, so we may need to add some special casing here to say that the file path used for that resolves against the last RUN --mount=type=bind,rw=true,src=. (note the special casing of .?)

Advantages:

  • Compatibility with existing Containerfiles we use today

Option FROM oci://stage/path

In this, the new syntax //stage/ as part of FROM oci://stage/path acts the same as a from= section of RUN --mount=type=cache.

Advantages:

  • Much clearer and less hacky in general IMO, and I think we could try to get this into Docker too

Disadvantages:

  • Not compatible with the Containerfiles we use today; it would be a bit of a pain to say "you need to update podman/buildah", but we could deal with it

@nalind (Member) commented Jan 29, 2025

Option lifetime-extend:

  • Detect `FROM oci:` in the container build and lifetime expand the overlayfs created by any `RUN --mount=type=bind,rw=true,src=<local path>` to the entire build
  • Though now that I think about this more, I'm pretty sure the `FROM oci:` part was actually relying on picking up the OCI archive from the host/default mount namespace by default, so we may need to add some special casing here to say that the file path used for that resolves against the _last_ `RUN --mount=type=bind,rw=true,src=.` (note the special casing of `.`?)

Advantages:

  • Compatibility with existing Containerfiles we use today

Having just done the changes to make this an overlay, I can say that the number of places in the source code that this impacts would be high. I also don't relish creating more differences in behavior that a user has to remember and keep track of, so I would lean toward not doing this.

Option FROM oci://stage/path

In this, the new syntax //stage/ as part of FROM oci://stage/path acts the same as a from= section of RUN --mount=type=cache.

Advantages:

  • Much clearer and less hacky in general IMO, and I think we could try to get this into Docker too

Disadvantages:

  • Not compatible with the Containerfiles we use today; it would be a bit of a pain to say "you need to update podman/buildah", but we could deal with it

oci: already has a meaning in this context, and the syntax for it is incompatible with this, so I don't see us doing that.

@cgwalters

oci: already has a meaning in this context, and the syntax for it is incompatible with this, so I don't see us doing that.

How would the syntax be incompatible? The leading :// seems to me to be sufficiently unlikely to occur for existing users. But if we wanted to avoid all potential for ambiguity, how about FROM --mount=type=cache,target=/output oci:/output/out.oci ?
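Spelling that last variant out against the earlier cache-mount example (hypothetical: FROM does not accept --mount today):

FROM registry.access.redhat.com/ubi9/ubi:latest as builder
RUN --mount=type=cache,sharing=private,dst=/output \
  dnf -y install skopeo && skopeo copy docker://busybox oci:/output/out.oci

# hypothetical syntax: mount the same cache so the oci: path resolves inside it
FROM --mount=type=cache,target=/output oci:/output/out.oci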

@cgwalters

Agreed, it's not as simple a story to tell.

It's more than a story, it's about the ecosystem. While there certainly are multiple build systems to build container images, Dockerfile is very, very popular as I think everyone here knows. And that means that actual infrastructure has been built on top of it.

At Red Hat, for example, with Konflux there's been an investment in a few things like "hermetic" builds, where the builds can't fetch arbitrary things from the network, only a restricted set of inputs. Also, there are policies around which container images can be used in tasks.

Both of these things are today applied on top of Dockerfile/Containerfile (though there are also lockfiles for rpms and other source content). As soon as "build this project" becomes more freeform than that, it needs custom logic to match these types of policies.

Another related thing is that there are some parts of Dockerfile that function perfectly fine and don't need reinventing, such as LABEL.

The fact that we were able to fit custom container images into the "shape" of a Dockerfile, and to keep reusing existing things like LABEL while producing reproducible tarballs in a more intelligent way than Dockerfile itself allows, has been extremely useful.

@nalind (Member) commented Jan 30, 2025

oci: already has a meaning in this context, and the syntax for it is incompatible with this, so I don't see us doing that.

How would the syntax be incompatible? The leading :// seems to me to be sufficiently unlikely to occur for existing users.

There is no leading :// for oci:, just as there isn't one for oci-archive:. The portion after that ":" is a filesystem path name, and that's a problem.

But if we wanted to avoid all potential for ambiguity, how about FROM --mount=type=cache,target=/output oci:/output/out.oci ?

The ideal for me here is an approach that doesn't depend on implementing custom syntax, and that doesn't break when we fix compliance bugs.

@cgwalters

doesn't depend on implementing custom syntax,

Right, it would clearly make sense to try to get this into Docker at some point, though last I looked, the ecosystem there is around buildx, which is what a whole lot of people who want to build containers in a more intelligent way are using. But maybe they'd be amenable to FROM --mount=type=cache or whatever. Anyway, that's not a short-term thing.


One thing I realized in a conversation just now with the team is that, because of this writable mount to the build context and some other reasons, we'd already required the build to be run with --security-opt=label=disable --cap-add=all --device /dev/fuse...which is basically --privileged. So the CVE isolation doesn't apply to this current use case anyway.
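For reference, that means such a build is already being invoked roughly like this (the image name is illustrative):

podman build --security-opt=label=disable --cap-add=all --device /dev/fuse -t localhost/mybase .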

So here's yet another, hopefully more targeted short term proposal:

Change buildah to detect when the input build is privileged in this way (CAP_SYS_ADMIN in the outer userns, which I am sure could be leveraged into a full breakout); if it's not, no behavior change from current git.

  • extend the lifetime of the most recent host bind mount that references . to the end of the build
  • change FROM oci:/FROM oci-archive: so that, when the image referenced is not found in the host mountns, they look in that most recent overlayfs

or so

@jlebon (Contributor) commented Jan 31, 2025

For the longer term RFE, one idea I had in coreos/rpm-ostree#4688 (comment) was to add the concept of "builder images". This is not novel of course; it's similar to OpenShift S2I and Buildpacks. But having it built into buildah would generalize the rpm-ostree case into something much more impactful for the ecosystem.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this issue Jan 31, 2025
Since containers/buildah#5952 broke
the Containerfile flow, let's add a workflow (that we wanted
to support anyways) to operate outside of a container build.

Signed-off-by: Colin Walters <[email protected]>
@cgwalters

coreos/rpm-ostree#5268 will switch us over to more clearly supporting a flow that doesn't depend on this now-removed feature.

@cgwalters

And as I am trying to document/test that PR, for the record, here's a pain point I'm hitting that I think is illuminating: I was recommending doing -v /var/lib/containers:/var/lib/containers so we can write to the container storage, but that obviously falls over for rootless builds.
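To illustrate the mismatch (default paths; both are configurable via containers-storage.conf):

# rootful: host and in-container storage paths line up
sudo podman build -v /var/lib/containers:/var/lib/containers ...
# rootless: the host-side storage is ~/.local/share/containers/storage instead,
# so the hardcoded -v flag above points at storage the user isn't using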

One thing I did in containers/bootc#919 that was some nontrivial code (that I can certainly rework to make usable here too) is dynamically mounting these things and not forcing the user to specify them. But even there I only needed to care about rootful; here it's a bit more complicated as really in the general case I want the code running in this container image to match the c/storage configuration of the host. I'm sure this has come up elsewhere?

I guess most use cases of "build from a container" actually end up with a c/storage instance scoped to that container, isolated from the host storage; which we obviously want to support too, but would be annoying for many local dev cases.

It's exactly this type of thing that we all just got For Free as being part of podman build.

@cgwalters changed the title from 'buildah 1.38.1 build "FROM oci-archive:" failed' to 'buildah 1.38.1 build "FROM oci-archive:" fails to find content generated inside a build' Jan 31, 2025
@cgwalters commented Jan 31, 2025

OK, here's another proposal: again, since in this use case we already require privileges, we add a new buildah-overlay=false option to RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,buildah-overlay=false that is only available to privileged builds. This seems like it'd be pretty simple, and we namespace the option to avoid any potential clash with docker.
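To spell the proposal out (buildah-overlay=false is hypothetical syntax from this comment, not an implemented option):

FROM registry.access.redhat.com/ubi9/ubi:latest as builder
RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,buildah-overlay=false \
  dnf -y install skopeo && skopeo copy docker://busybox oci:/buildcontext/out.oci

FROM oci:./out.oci

With the overlay disabled, the write would land in the real build context again, so the later FROM could find the file, at the cost of requiring a privileged build.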

@cgwalters

For the longer term RFE, one idea I had in coreos/rpm-ostree#4688 (comment) was to add the concept of "builder images".

We should probably split off longer term things into a distinct issue, but it would seem to me to be strictly better to have a model where the Containerfile can control this and it's not something that needs to be passed via a new CLI argument. I see my proposal above of FROM oci://stage/path (or some variant of that) as strictly more powerful, basically because the "builder" image here can just be another FROM argument.

@dustymabe

better to have a model where the Containerfile can control this and it's not something that needs to be passed via a new CLI argument

👍

@jlebon (Contributor) commented Feb 3, 2025

For the longer term RFE, one idea I had in coreos/rpm-ostree#4688 (comment) was to add the concept of "builder images".

We should probably split off longer term things into a distinct issue, but it would seem to me to be strictly better to have a model where the Containerfile can control this and it's not something that needs to be passed via a new CLI argument. I see my proposal above of FROM oci://stage/path (or some variant of that) as strictly more powerful, basically because the "builder" image here can just be another FROM argument.

Yes, for sure. But the point of build models like S2I and Buildpacks is that you decouple your app code from the buildroot so they can be maintained independently, possibly by different organizations. So you wouldn't even have a Containerfile in your git repo. Obviously there are tradeoffs with that too.

Where it's useful for our case specifically is that there's less cargo-culting that needs to be done to get this working right. We can e.g. just implement the magic FROM oci: equivalent once on the builder image side. But also it makes it easier to build declarative interfaces on top.

Of course, having built-in Containerfile support is not mutually exclusive with the concept.

@vrothberg (Member)

OK, here's another proposal: again, since in this use case we already require privileges, we add a new buildah-overlay=false option to RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,buildah-overlay=false that is only available to privileged builds. This seems like it'd be pretty simple, and we namespace the option to avoid any potential clash with docker.

I am in favor of making the new security guard configurable. The CVE fix is valid, but currently there is no alternative to the previous behavior, and it can hit any user. So +1 from my end on the direction. I refrain from suggesting a better name :^)

@nalind, how do you feel about it?

@nalind (Member) commented Feb 4, 2025

I think that the ability to bypass "shouldn't be allowed to write to the build context" isn't the kind of thing we're supposed to be trusting a Containerfile's author on.

@vrothberg (Member)

I think that the ability to bypass "shouldn't be allowed to write to the build context" isn't the kind of thing we're supposed to be trusting a Containerfile's author on.

Very fair point. It would be a different matter if it had to be explicitly allowed by the build environment/builder/admin.

@dustymabe commented Feb 4, 2025

I think that the ability to bypass "shouldn't be allowed to write to the build context" isn't the kind of thing we're supposed to be trusting a Containerfile's author on.

I actually don't think the feature we desire (and were previously using) requires writing to the build context.

IIUC all we really want to do is be able to take a file or directory generated in an earlier stage of a multi-stage build and slurp that into the FROM so that the output of the overall container build from the Containerfile is the image that was generated in that earlier stage.

Writing out that file to the build context and then using it in the form of FROM oci-archive: worked nicely, but we could (I believe) make it work without writing to the build context, and thus not bump into the security issue.

I think Colin suggested something like this earlier:

FROM oci://stage/path

I guess an example here would be something like:

FROM quay.io/fedora/fedora-bootc:rawhide as rootfs
RUN dnf -y install htop
RUN bootc container lint

FROM quay.io/fedora/fedora-bootc:rawhide as builder
RUN --mount=from=rootfs,dst=/rootfs <<EORUN
rpm-ostree experimental compose build-chunked-oci --bootc --format-version=1 \
           --rootfs=/rootfs --output /output/out.oci
EORUN

FROM oci://builder/output/out.oci
LABEL empty # not sure if this would be required or not

Alternatively, something like this could work:

FROM --mount=from=builder,src=/output/,dst=/output/ oci:/output/out.oci
LABEL empty # not sure if this would be required or not

I think the exact syntax is not too important (you container experts are probably better equipped to know what syntax would be ideal here). What is important (to us at least) is the use case.

I know what we're describing here is a specialized use case. Maybe it's worth a new issue/RFE?

@cgwalters

I think that the ability to bypass "shouldn't be allowed to write to the build context" isn't the kind of thing we're supposed to be trusting a Containerfile's author on.

I think I addressed this in my comment above - again we already require privileges (for other reasons) for this build type, so gating this capability on e.g. CAP_SYS_ADMIN from the outer namespace would be fine. Hmm, actually...when talking about capabilities here, of course the user namespace is important. It looks like (unsurprisingly) podman build as root mirrors podman run and does not allocate a userns by default. That'd be good to evangelize changing, as it adds a lot of security.

(But yes, of course at a technical level there's no reason this should require privileges; we're just hung up on a syntax to tell the FROM stage not to operate on the outer mountns.)


Anyways...I have to say on one hand it's good that this happened now and not months from now; that would have been dramatically more painful. And supporting this outside of a Podman-specific Containerfile feature is something we had to do regardless.


While we are doing that work to decouple, bottom line: it'd still be really, really useful to have a short-term change to re-enable this (again, gated on privileges) that e.g. emitted a warning "this feature may be removed in a year" or something.

@dustymabe commented Feb 4, 2025

again we already require privileges (for other reasons) for this build type

I may be missing something, but narrowing the scope of this exact issue/RFE would be useful.

What's not clear to me is if the feature we desire HERE actually requires privileges. The workflow I describe in my previous comment shouldn't.

I guess another way to write the example (to remove the tooling from the equation, to try to add clarity):

FROM quay.io/fedora/fedora:rawhide as curler
RUN mkdir -p /output && curl https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/41.20250117.3.0/x86_64/fedora-coreos-41.20250117.3.0-ostree.x86_64.ociarchive > /output/out.ociarchive

FROM oci-archive://curler/output/out.ociarchive
LABEL empty # not sure if this would be required or not

In this scenario, I don't see the need for privilege at all. Am I missing something?

@nalind added the jira label (Issues which will be sync'd to a card at https://issues.redhat.com/projects/RUN) Feb 4, 2025
@nalind (Member) commented Feb 4, 2025

I don't think suggestions that require changing the details of how we interpret containers-transports(5) locations are going to be part of the solution here, particularly when we've already established that the current acceptance of them has to have a limited shelf life.

@dustymabe

I don't think suggestions that require changing the details of how we interpret containers-transports(5) locations are going to be part of the solution here

OK. That's acceptable, but I feel like one of the options from #5952 (comment) may not conflict with that?

FROM --mount=from=builder,src=/output/,dst=/output/ oci:/output/out.oci
LABEL empty # not sure if this would be required or not

i.e. no new container-transports parsing required in that example, but support for a way to mount in the output from a previous stage.

particularly when we've already established that the current acceptance of them has to have a limited shelf life.

hmm. I missed that part. Are you implying that FROM won't work with containers-transports(5) described locations in the future? i.e. FROM oci-archive:./foo.ociarchive even if it's not from a previous build stage?

@nalind (Member) commented Feb 5, 2025

They really shouldn't, no. Pathnames don't really make sense in remote builds.

@nalind (Member) commented Feb 6, 2025

Maybe another possible fix would be to expand the lifetime of the overlayfs created from a RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared to the entire build, instead of each RUN invocation?

To do that, since we have to allow for multiple simultaneous users, some reading and some writing, we have to mount the overlay over the build context directory before any stage accesses the directory tree in any way, so that ADD, COPY, and RUN instructions that read from the location reflect changes made by RUN instructions that modify that location in different stages that could be running concurrently.

Fixing up the handling of paths in certain FROM values is still necessary because "." is the directory from which we were invoked, and that's not where data being "written" to the build context will be, but there's still potential for problems there.

#5975 tries to get this workflow going again.

@dustymabe

IIUC #5975 deals with the build context directory, which is how we were using this before. But taking a step back and looking at what we want to do, wouldn't being able to FROM a file from a previous stage be a better solution? Leveraging the build context was just a vehicle for what we wanted to do, and maybe not the best solution.

FROM quay.io/fedora/fedora:rawhide as curler
RUN mkdir -p /output && curl https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/41.20250117.3.0/x86_64/fedora-coreos-41.20250117.3.0-ostree.x86_64.ociarchive > /output/out.ociarchive

FROM --mount=from=curler,src=/output/,dst=/output/ oci-archive:/output/out.ociarchive
LABEL empty # not sure if this would be required or not

@cgwalters

wouldn't being able to FROM a file from a previous stage be a better solution?

I think everyone agrees it would; I proposed something similar above. But the concern here is that it requires new syntax which doesn't exist in Dockerfile AIUI; see this comment and the one below.

@yuravk commented Feb 12, 2025

AlmaLinux has been facing the same issue since podman version 5.2.2-13 (Buildah v1.37.6) in our BootC images pipeline:
https://github.com/AlmaLinux/bootc-images/actions/runs/13255082776/job/37000516781

Generating container image...done
Pushed digest: sha256:1a6dc60447f117b9ba01f55afac44bee6eceae31570ef52f3234bbd1b1b4b232
Wrote: oci-archive:/buildcontext/out.ociarchive
--> b120be48e54e
[3/3] STEP 1/2: FROM oci-archive:./out.ociarchive
Error: creating build container: creating temp directory: archive file not found: "/actions-runner/_work/bootc-images/bootc-images/out.ociarchive"
