COS-2611: Use deploy-via-container #657
Conversation
Skipping CI for Draft Pull Request. |
/test all |
Experimenting with ostree-containers in OCP/RHCOS8

First, https://copr.fedorainfracloud.org/coprs/g/CoreOS/rpm-ostree-rhel8/ contains binary RPMs for RHEL8. Using that with this PR, I've run:

New format container:
Old (current) format container (archive ostree + extensions RPMs):
OCP release image: registry.ci.openshift.org/coreos/walters-ocp410-ostreecontainer@sha256:8b0b769a8993e0c3e3bef11805644fb6ce89927473ed8640825a58f7091c708d

Things to try
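(The exact command isn't shown above; a hypothetical sketch using the rebase syntax rpm-ostree had for container images at the time, with `<image>` standing in for the new-format container built above:)

```
# Hypothetical sketch; <image> is the new-format container image referenced above
sudo rpm-ostree rebase --experimental ostree-unverified-registry:<image>
```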
That will flip your machine over into the new format container image.

Generating a derived build in-cluster

I pushed an rhcos branch of my fcos-derivation-example. I then did this:
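(The commands aren't shown above; a hedged sketch of what an in-cluster build from that branch might look like — repository URL and names assumed:)

```
# Hypothetical sketch: build the derivation repo in-cluster with a Docker-strategy build
oc new-project rhcos-derive
oc new-build https://github.com/cgwalters/fcos-derivation-example#rhcos --strategy=docker
oc logs -f bc/fcos-derivation-example
```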
Actually pulling the image

The first problem you'll notice here is that the registry is only on the pod network, not the host network. I chose to expose the registry to make it easier to fetch content from the host network. However, clearly we need to make it easy to fetch content from the pod network into the host network via a proxy - should we have rpm-ostree-on-RHCOS know how to automatically invoke

Allow anonymous pulls

The default kubelet pull secret does not have credentials for the in-cluster registry. (TODO: figure out how that works)

Add the cluster service CA to the host trust root

I think I took the content of the

Result
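(For reference, hedged sketches of the two registry steps described above — exposing the registry's default route and trusting the service CA — with the operator resource and configmap names assumed per the OpenShift docs of that era:)

```
# Expose the internal registry via its default route (reachable from the host network)
oc patch configs.imageregistry.operator.openshift.io/cluster --type=merge \
  -p '{"spec":{"defaultRoute":true}}'

# Copy the cluster service CA into the host trust root (configmap name assumed)
oc -n default get configmap openshift-service-ca.crt \
  -o jsonpath='{.data.service-ca\.crt}' > /etc/pki/ca-trust/source/anchors/service-ca.crt
update-ca-trust extract
```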
|
I am not too familiar with the kubelet/CRI/cri-o here - how does it work today for the kubelet to pull an image from the internal registry? How is the authentication handled? I found this code and may start tracing from there. But basically I think what we want here is an easy way for the host update process to get the credentials to pull an image in the same way the kubelet is doing. (But not actually pull the image into the container storage) |
@cgwalters Here are some docs for how kubelet could populate the pull secret:
CRI-O also has a |
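(Hedged sketch of the knobs being discussed — on OpenShift nodes the kubelet's pull secret lives at /var/lib/kubelet/config.json, and CRI-O can be pointed at an auth file in its config; the option name here is an assumption:)

```
# /etc/crio/crio.conf (sketch; option name assumed)
[crio.image]
global_auth_file = "/var/lib/kubelet/config.json"
```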
@mrunalp How does it work today for the kubelet to pull images from a cluster-internal registry? So crio is on the host network, and trying out
Yet, clearly it's resolving that DNS name somehow. I've been looking in the crio code and it's not obvious...oh wait, does it have something to do with |
OK moving this to ostreedev/ostree-rs-ext#121 |
Not quite, SeparatePullCgroup is for isolating the memory used to pull large images away from CRI-O. What you're looking for is the image destination resolution code, which lives at https://github.com/cri-o/cri-o/blob/main/internal/storage/image.go#L395. As far as translating between a registry URL and an actual IP/port to pull from, I am not as familiar. I usually pull in @mtrmac or @vrothberg for such questions. |
WRT the low-level pull implementation: There is a bit of mapping for

At least https://github.com/openshift/cluster-image-registry-operator/blob/master/manifests/0000_90_cluster-image-registry-operator_01_operand-servicemonitor.yaml does suggest that the quoted name should be resolvable within the cluster’s pods. So this should be a ~general Kubernetes DNS question, and I’m afraid I don’t know the details of that. |
Right - that's straightforward. But here I want to pull from the host network namespace. So far, at least in OpenShift, we have explicitly not had the host depend on or use cluster DNS, because it creates a dependency cycle - the host needs to use DNS to pull the pods that serve DNS in-cluster, etc. There is some thought that we should have the host optionally use cluster DNS only when it's up, and have a health check that resets the configuration back if e.g. the cluster DNS pods are being upgraded, etc.

Perhaps... I basically need to copy the code from containers/skopeo#1476 into crio too, and have the MCO do something like "if kubelet is up, inject configuration into rpm-ostreed.service to tell it to fork off a crio binary instead"? Or, perhaps even simpler, add

But, I'd like to avoid duplicating that code...

OK, since no one seems to know offhand how this is working, I'll try to dig in, and if it's easy to replicate outside of crio (maybe e.g. kubelet is doing a DNS lookup in the pod network and passing that IP down to crio via a backchannel?) I'll do that; otherwise I'll fall back to patching crio. |
OK https://copr.fedorainfracloud.org/coprs/g/CoreOS/rpm-ostree-rhel8 is updated with the latest skopeo+rpm-ostree changes. Previously it was trying to target both Fedora and RHEL8 (via C8S), but since the code is in Fedora I've now changed it to just target RHEL8. But even doing that suddenly required switching from c8s to epel8, because https://git.centos.org/rpms/json-c/c/aa505d489ccc4ad2e2abfcc61b08b8f8b272c4f4?branch=c8s broke binary backcompat (and libdnf links to json-c). I also made a few fixes to ensure the stack runs on rhel8, which are rolled into coreos/rpm-ostree#3249. Next step: update release image etc. (EDIT: DONE) See #657 (comment) for new status. |
First, today I randomly stumbled across https://docs.openshift.com/container-platform/4.9/registry/accessing-the-registry.html#registry-accessing-directly_accessing-the-registry, which is really exactly about this issue. And when I tried it, I was extremely confused, because indeed

After some more digging, I finally figured it out:
Something is special casing the registry in

This misled me to somehow think some magic with network namespaces was involved, because I then didn't even try using e.g.

Also, hooray for the cookie

And now I discover yet another chunk of load-bearing bash script run as root. |
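(A guess at what's being described above — on OpenShift nodes the internal registry's service hostname is special-cased on the host itself, which is easy to check; hostname assumed:)

```
# Hypothetical check from the host: the registry hostname is pinned in /etc/hosts
grep image-registry /etc/hosts
```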
This changes the CI configuration for the `mcbs` branch to use openshift/os#657
OK now I'm banging my head against image pull authentication again. I would swear previously I did
as root to get a

What we really need to drill into here is basically re-using the kubelet credentials, I think. |
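(A hedged sketch of that direction — skopeo accepts an auth file, and on OpenShift nodes the kubelet's pull secret lives at /var/lib/kubelet/config.json; path and image reference assumed:)

```
# Hypothetical: reuse the kubelet pull secret for a manual pull check as root
skopeo inspect --authfile /var/lib/kubelet/config.json \
  docker://image-registry.openshift-image-registry.svc:5000/<namespace>/<image>
```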
OK, we have a green for RHEL now. However, this one could still use some more manual testing. |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close. /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. /lifecycle rotten |
I was asked about the status of this issue; I think it's likely to be fixed in the MCO now, but the simplest way to test is basically:
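(The concrete steps aren't shown here; a rough sketch of the shape of such a test, with commands and resource names assumed:)

```
# Sketch: build a disk image from this branch, boot a node from it, then watch
# the machine-config-daemon logs on that node for how it handles the osimageurl
cosa build
oc -n openshift-machine-config-operator logs ds/machine-config-daemon -c machine-config-daemon
```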
If it errors out with just "unexpected osimageurl", that's basically OK. If it errors in a more fatal way, then the osimageurl parsing most likely still needs work. The most realistic test is to scale up a new worker node (or a whole cluster) from a disk image with this change.
This, however, is still an issue. |
Yeah, that's the error I'd expect, which is generally good. It means we didn't fail to parse it, at least. |
Note this change really has no effect on the generated container image, only on disk images. |
@cgwalters any reason not to let CI run on this and then merge if all looks good? I guess the pull spec looking like a directory is not that awful. Not sure if we are tracking official pull specs anywhere else. |
Unfortunately CI in this repository still doesn't cover running OpenShift, which is where this change would matter in general. Most of our kola unit tests don't care. |
/retest
so this is what still needs to be tested, I think. It's been a while since I've done this, but @jmarrero I think you can create an image in GCP (or an AMI in AWS if that's easier) and then modify the machine pool of a test cluster in that cloud to specify a different AMI ID or GCP image name for workers to start, then scale it up. |
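(A hedged sketch of that flow for AWS — the coreos-assembler flags and MachineSet field names are assumptions:)

```
# Build and upload an AMI from this branch (flags assumed)
cosa build
cosa buildextend-aws --upload --region us-east-1 --bucket s3://<bucket>
# Point a worker MachineSet at the new AMI and scale it up
oc -n openshift-machine-api edit machineset <name>   # set providerSpec.value.ami.id
oc -n openshift-machine-api scale machineset <name> --replicas=2
```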
Tested on a cluster by:
I also changed the AMI name by a few chars to make sure it is really using the one I uploaded. |
Thanks @jmarrero - also coreos/coreos-assembler#3700 just merged, which should give us a less hacky-looking imgref when running

I'd say this is unblocked now? |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jmarrero

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment. |
/retitle COS-2611: Use deploy-via-container |
@cgwalters: This pull request references COS-2611 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/hold cancel |
@cgwalters: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest |
Use deploy-via-container

This will cause us to run through the ostree-native container
stack when generating the disk images.

Today for RHCOS we're using the "custom origin" stuff, which
lets us inject metadata about the built source, but rpm-ostree
doesn't understand it.

With this, in the future (particularly after coreos/coreos-assembler#2685)
`rpm-ostree status` will show the booted container and understand it; for
example, in theory we could have `rpm-ostree upgrade` work (if we
provisioned from a tag instead of a digest).

A side benefit is this should fix coreos/coreos-assembler#2685,
because we're now pulling a single large file over 9p instead of lots
of little ones.
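(For reference, a sketch of the kind of change this PR makes — the key name comes from the PR title, but the exact file contents are assumed:)

```
# image.yaml (sketch; surrounding contents assumed)
deploy-via-container: true
```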