-
Describe the bug

Clevis can no longer decrypt the root partition with a Tang server after an OKD update (from 4.7.0-0.okd-2021-06-19-191547 to 4.7.0-0.okd-2021-08-07-063045). The message

And when I connect to the nodes via SSH, I see a message telling me that the

Version
Additional information

Here is the MachineConfig template I use to encrypt my nodes:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      name: 90-{{ NODE_ROLE }}-tang
      labels:
        machineconfiguration.openshift.io/role: {{ NODE_ROLE }}
    spec:
      config:
        ignition:
          version: 3.2.0
        storage:
          disks:
            - device: /dev/sda
              partitions:
                - label: root
                  number: 4
                  sizeMiB: 0
                  resize: true
          luks:
            - name: root
              device: /dev/disk/by-partlabel/root
              label: luks-root
              clevis:
                tang:
                  - url: http://{{ SERVICES_VM_IP }}:7500
                    thumbprint: {{ TANG_THUMBPRINT }}
              keyFile:
                source: data:,{{ LUKS_PASSPHRASE }}
              options: [--cipher, aes-cbc-essiv:sha256]
              wipeVolume: true
          filesystems:
            - device: /dev/mapper/root
              format: xfs
              wipeFilesystem: true
              label: root
      kernelArguments:
        - rd.neednet=1

The update to 4.7.0-0.okd-2021-08-07-063045 also upgraded some packages:
The initramfs (after the update) contains the following dracut modules:

I think the bug is due to the NetworkManager package upgrades, because when I roll back to the previous version of the machine OS, everything works as it used to:

    [core@okd4-control-plane-0 ~]$ sudo rpm-ostree status
    State: idle
    Deployments:
    ● pivot://quay.io/openshift/okd-content@sha256:c4b29959c87a1632923e5f30aa211f6439c652c8ab9af94d55b9f40b458b48f7
        CustomOrigin: Managed by machine-config-operator
        Version: 47.34.202107301811-0 (2021-07-30T18:14:58Z)

      pivot://quay.io/openshift/okd-content@sha256:ce3e69194860476064a7a36075a9de3b46b2eaf6851ceedae41a69e5124a3637
        CustomOrigin: Managed by machine-config-operator
        Version: 47.34.202106191111-0 (2021-06-19T11:14:24Z)
    [core@okd4-control-plane-0 ~]$ sudo rpm-ostree rollback

I also noticed that I could not add new nodes because when I send a request to the MCO endpoint (e.g.
I don't know why the MCO wants to convert the config to Ignition v2.2.0. How can I solve this problem? Note that I tried to update to the latest release (4.7.0-0.okd-2021-08-22-163618), but I am still facing this issue.

How reproducible
Replies: 3 comments 1 reply
-
@dustymabe could you help us with the Tang issue?
You need to include Ignition version
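
As far as I know, a request to the Machine Config Server can ask for a specific Ignition spec version through its `Accept` header. Here is a minimal Python sketch of such a request, assuming the server picks the spec version from that header. The hostname is a placeholder, `build_mcs_request` is a hypothetical helper, and port 22623 with the `/config/<pool>` path are the usual defaults rather than values taken from this thread:

```python
import urllib.request


def build_mcs_request(host: str, pool: str = "worker",
                      ignition_version: str = "3.2.0",
                      port: int = 22623) -> urllib.request.Request:
    """Build (but do not send) a request for a pool's Ignition config.

    Assumption: the Machine Config Server selects the Ignition spec
    version from the Accept header of the request.
    """
    url = f"https://{host}:{port}/config/{pool}"
    accept = f"application/vnd.coreos.ignition+json;version={ignition_version}"
    return urllib.request.Request(url, headers={"Accept": accept})


# Placeholder hostname; replace it with your cluster's internal API name.
req = build_mcs_request("api-int.example.com")
print(req.full_url)
print(req.get_header("Accept"))
```

The sketch only builds the request. Sending it with `urllib.request.urlopen` would also need an SSL context that trusts the cluster's CA.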
-
I also deploy my nodes using the

Here is the Butane template that I use:

    variant: fcos
    version: 1.3.0
    systemd:
      units:
        - name: install.service
          enabled: true
          contents: |
            [Unit]
            Description=Run CoreOS Installer
            Requires=coreos-installer-pre.target
            After=coreos-installer-pre.target
            OnFailure=emergency.target
            OnFailureJobMode=replace-irreversibly
            After=network-online.target
            Wants=network-online.target
            [Service]
            Type=oneshot
            ExecStart=/usr/bin/coreos-installer install /dev/sda --insecure-ignition \
              --ignition-url http://{{ WEB_SERVER_IP }}:8080/okd4/worker.ign \
              --image-url http://{{ WEB_SERVER_IP }}:8080/okd4/fcos.raw.xz \
              --append-karg ip={{ NODE_IP1 }}::{{ DEFAULT_GATEWAY }}:255.255.255.0:{{ NODE_FQDN }}:{{ NET_DEVICE }}:none \
              --append-karg ip={{ NODE_IP2 }}:::255.255.255.0::{{ NET_DEVICE }}:none \
              --append-karg nameserver={{ DNS1_IP }} \
              --append-karg nameserver={{ DNS2_IP }} \
              --append-karg nameserver={{ DNS3_IP }}
            ExecStart=/usr/bin/systemctl --no-block reboot
            StandardOutput=kmsg+console
            StandardError=kmsg+console
            [Install]
            RequiredBy=default.target

Then when I start the nodes, I type the following kernel argument:
Does this way of deploying my nodes work in newer versions of FCOS?
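
For reference, the `ip=` arguments in the Butane template follow dracut's static layout `ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:<autoconf>`. Here is a small Python sketch that assembles such a string, to make the empty fields easier to read. `dracut_ip_karg` is a hypothetical helper, the addresses and interface name are made-up examples, and IPv6 addresses (which need brackets) are not handled:

```python
def dracut_ip_karg(client_ip: str, netmask: str, interface: str,
                   gateway: str = "", hostname: str = "",
                   peer: str = "", autoconf: str = "none") -> str:
    """Assemble a dracut-style static ip= kernel argument.

    Field order: client IP, peer, gateway, netmask, hostname,
    interface, autoconf. Empty fields stay empty between the colons.
    """
    fields = [client_ip, peer, gateway, netmask, hostname, interface, autoconf]
    return "ip=" + ":".join(fields)


# First address: with gateway and hostname (example values only).
print(dracut_ip_karg("192.168.1.10", "255.255.255.0", "ens192",
                     gateway="192.168.1.1", hostname="node.example.com"))
# Second address: no gateway and no hostname, as in the template.
print(dracut_ip_karg("10.0.0.10", "255.255.255.0", "ens192"))
```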
coreos/fedora-coreos-tracker#943