[OKD-SCOS v4.16] UPI install with agent installer or assisted installer fails for 4.16.0-0.okd-scos-2024-08-01-132038 #2015
Replies: 4 comments
-
I've just checked: the agent-based installer is not part of docs.okd.io; you can find it on docs.openshift.com. Maybe this method is not supported by OKD!
-
Follow-up
Also fails with the Assisted Installer:
The installation fails with the Assisted Installer in the same way. Not a big surprise, as the Assisted Installer uses the agent installer under the hood.
Workaround for a success with Assisted Installer:
The installation works when replacing the embedded FCOS bootstrap OS with RHCOS:
Workaround for a success with Agent Installer (ABI):
Overriding the bootstrap OS image with an RHCOS image also makes the installation succeed when installing OKD via ABI, by using the following when building the install image:
I did not choose a random bootstrap OS image: this is the one for v4.16 used for an OCP installation via the ABI, as specified here: https://github.com/openshift/assisted-service/blob/d3324b06a7c7772f4619c3ab13dd8c0706e55fd9/deploy/podman/configmap.yml#L25
Q:
-
Hi, I have encountered the exact same issue. Overriding the bootstrap OS image with an RHCOS image as proposed does not work for me, as my lab hardware uses an LSI controller (Dell PERC H310 with IT firmware) that is not recognized by the RHCOS image (the mpt3sas driver on RHCOS has disabled support for this hardware). The other alternative, using openshift-install-linux-4.15.0-0.okd-scos-2024-01-18-223523, does not work for me either: the rendezvous host fails to start the bootstrap with the error 'pull secret for new cluster is invalid: pull secret must contain auth for "registry.ci.openshift.org"'. @titou10titou10, shouldn't you open an issue about this problem? Best Regards, Alain
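The pull-secret error quoted above can be checked for before booting the rendezvous host. Here is a minimal sketch in Python, assuming the pull secret uses the usual JSON layout with a top-level "auths" object keyed by registry hostname. The function name and the sample secret (with placeholder credentials) are hypothetical and not part of the installer.

```python
import json


def missing_registry_auths(pull_secret_text, required_registries):
    """Return the required registries that have no entry under "auths"."""
    secret = json.loads(pull_secret_text)
    auths = secret.get("auths", {})
    return [registry for registry in required_registries if registry not in auths]


# Hypothetical pull secret: placeholder credentials, only quay.io present.
example_secret = json.dumps({"auths": {"quay.io": {"auth": "ZmFrZTpmYWtl"}}})

missing = missing_registry_auths(
    example_secret, ["quay.io", "registry.ci.openshift.org"]
)
print(missing)  # ['registry.ci.openshift.org']
```

This only inspects which registry keys are present. It does not verify that the credentials are valid, and it may not match every check the installer itself performs.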
-
I reproduced this problem with the 4.16.0-0.okd-scos-2024-09-27-110344 image on a Dell PowerEdge R740. I also verified the proposed RHCOS workaround. That worked fine for me.
-
Context
Trying to install a SNO (Single Node) cluster:
It is important to note that the install works perfectly well with the exact same agent and install config files for
I have also tried a multi-node cluster; it fails the same way.
Summary
It fails with the following error from the "release-image-pivot" service:
Details
install-config.yaml:
agent-config.yaml
The install fails a few minutes after the node boots: the process keeps failing and loops forever.
"kubelet" service:
"release-image-pivot" service:
So does "release-image-pivot" fail to start because of this problem?:
Other (pertinent?) info:
approve-csr.service
podman
Message from the installer: