[SURE-6536] Cluster labels cannot be removed, updated or modified #9563

Open
gaktive opened this issue Jul 21, 2023 · 19 comments

@gaktive
Member

gaktive commented Jul 21, 2023

Internal reference: SURE-6536
Reported in 2.7.3 & 2.7.4

Issue description:
Cluster labels are deleted or overwritten when modified or added.

Business impact:
Fleet operates by targeting cluster groups; however, clusters cannot be consistently added to or removed from groups because the label changes get overwritten by Rancher.

Troubleshooting steps:

  • View cluster labels
  • Edit the cluster labels
  • Save changes
  • Wait a moment (sometimes the change takes effect, only to be overwritten shortly afterward). In Fleet this causes workload flapping.
  • View cluster labels - note the change did not stick

Repro steps:

  • Create a downstream cluster (e.g. import a k3s cluster)
  • Add a label during cluster creation or afterward.
  • Try to remove or modify the label

Workaround:
None

Actual behavior:
Cluster labels are reverted to their state before the change was made.

Expected behavior:
Cluster labels change according to the edit.

Additional notes:
An internal person was able to reproduce this with a k3s cluster created via the node driver and also with imported clusters, though they have not tried other cluster types.

A fix was done for 2.7.5 via rancher/ui#4975 but from that person's tests:

My tests included an imported k3s cluster and an aws node driver cluster. I did not test with Elemental clusters.

Imported k3s cluster

  • Labels can be added to a cluster at the time of cluster creation [SUCCESS]
  • Labels can be added to a cluster after cluster is provisioned [SUCCESS]
  • Labels can be deleted from a cluster if they were added after the cluster was provisioned [SUCCESS]
    - Labels cannot be deleted from a cluster if the label was added during cluster creation [FAIL]
  • Labels can be edited in a cluster if the label was added after the cluster was provisioned [SUCCESS]
    - Labels cannot be edited if the label was added during cluster creation [FAIL]

Hostbusters (@snasovich) offered some guidance from the backend standpoint and it was determined that the user tried this example:

I create a couple labels during the process of cluster creation "Import -> Labels and Annotations -> Add Label"

geo: "ny"
managed: "yes"

If I try to delete these labels, doing "Edit Config" on the cluster from the "Manage Clusters" page, it appears to delete from the UI but after saving the form, the Label is still there.

Also, if I "Edit Config" and try to modify the value of the label.

managed: "no"

for example, then it appears to change the value on the form but in fact the value is not changed when I save the form. Go back in to view it - go back to config view or edit yaml to see it is unchanged.

By the way, that is all via the "Manage Clusters" page.

I just tested the same operations via "Continuous Delivery" and on the surface it appears to work but if you go back in to edit configs, you can see that it did not work. It does not help that these pages show different (inaccurate) information.

@gaktive
Member Author

gaktive commented Jul 24, 2023

First task is to reproduce this and confirm whether this is UI or if this points to backend.

@richard-cox
Member

richard-cox commented Jul 26, 2023

I've given this a quick try with v2.7.5 and the latest ui code

  • Imported cluster - labels added during cluster create are shown in both the detail view and edit config view
  • Imported cluster - labels added when the cluster is edited do NOT show in the detail view but DO show in the edit config view
    • the steve provisioning cluster object does NOT contain the label added when editing the cluster
    • the norman cluster object DOES contain the label added when editing the cluster
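One way to check for the divergence described in the bullets above is to compare the label maps of the two objects. This is a minimal Python sketch only: `diff_labels` is a hypothetical helper, and the label values are made up for illustration (in practice they would come from the `metadata.labels` of the steve and norman objects).

```python
def diff_labels(steve_labels, norman_labels):
    """Return labels whose values differ between the two views of a cluster.

    Each entry maps a label key to a (steve_value, norman_value) tuple,
    with None standing in for a label that is absent on that side.
    """
    keys = set(steve_labels) | set(norman_labels)
    return {
        key: (steve_labels.get(key), norman_labels.get(key))
        for key in sorted(keys)
        if steve_labels.get(key) != norman_labels.get(key)
    }


# Hypothetical label sets: "env" was added while editing the cluster.
steve = {"geo": "ny", "managed": "yes"}
norman = {"geo": "ny", "managed": "yes", "env": "dev"}

divergence = diff_labels(steve, norman)
print(divergence)
```

An empty result would mean both objects agree; any entry points at a label that only one of the two APIs knows about.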

The edit cluster process for imported clusters is in ember, which updates the norman cluster. It might be something with the sync process between norman and steve cluster objects

We should confirm if the above is true when using the UI from v2.7.5 as well

@gaktive
Member Author

gaktive commented Jul 26, 2023

We'll confirm if this is backend or not.

@aalves08
Member

Update @gaktive @richard-cox : just confirmed Richard's findings with a Rancher v2.7.5 + ui code v2.7.5.

@gaktive
Copy link
Member Author

gaktive commented Jul 28, 2023

Looks like backend after all based on the Steve to Norman talk. Will transfer ticket over and notify the SURE folks that this will not be in Q3.

@gaktive gaktive added the team/area2 Hostbusters label Jul 28, 2023
@gaktive gaktive transferred this issue from rancher/dashboard Jul 28, 2023
@gaktive
Member Author

gaktive commented Jul 28, 2023

Transferred. Att'n @Sahota1225

@felipe-colussi

felipe-colussi commented Aug 16, 2023

I did some research on it.

Rancher has two distinct objects that are used to manage and keep cluster information: cluster.provisioning.cattle.io (v1) and cluster.management.cattle.io (v3).

Clusters created from Rancher using RKE2 and K3s, as well as imported clusters, are considered "clusters", while clusters created from RKE and the local cluster are considered "legacy clusters".

The "normal clusters" are created using the cluster.provisioning.cattle.io (Steve - v1) object as the base and replicate some data to the cluster.management.cattle.io (Norman - v3) object to be used by other controllers. The cluster.provisioning.cattle.io (v1) object is the one responsible for maintaining the cluster information.

The imported clusters have the cluster.provisioning.cattle.io (v1) object as their primary object. Once this object or the cluster.management.cattle.io (v3) object is changed, Rancher runs a sync that reflects the values from the v1 object into the v3 one.

When "importing a cluster", the UI does a POST to {url}/v1/provisioning.cattle.io.clusters, which creates a v1 object; after that, Rancher creates the v3 object.

When updating the imported cluster, the UI does a PUT to {url}/v3/clusters/{cluster-name}, which edits the cluster.management.cattle.io (v3) object. Once that happens, Rancher syncs the objects, overwriting the data from the v3 object with the data present on the v1 object, as expected.
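The effect of that one-way sync on labels can be illustrated with a toy model. This is a sketch under stated assumptions, not Rancher's controller code: `sync_v1_to_v3` is a hypothetical function, and it assumes the sync behaves like a merge in which v1 values win, which is chosen only because it matches the test results reported earlier in this issue.

```python
import copy


def sync_v1_to_v3(v1_obj, v3_obj):
    """Toy model of the sync: copy v1 labels onto v3, with v1 winning.

    Assumption for illustration only: labels present solely on v3 are kept,
    while any label present on v1 overwrites the v3 value.
    """
    synced = copy.deepcopy(v3_obj)
    merged = dict(v3_obj["metadata"]["labels"])
    merged.update(v1_obj["metadata"]["labels"])
    synced["metadata"]["labels"] = merged
    return synced


# Labels set at import time live on the v1 (provisioning) object.
v1 = {"metadata": {"labels": {"geo": "ny", "managed": "yes"}}}
v3 = copy.deepcopy(v1)

# The UI edit goes to the v3 object: change "managed", delete "geo", add "env".
v3["metadata"]["labels"] = {"managed": "no", "env": "dev"}

# The sync then runs from v1 to v3 and undoes the first two changes.
v3_after_sync = sync_v1_to_v3(v1, v3)
print(v3_after_sync["metadata"]["labels"])
```

Under this model, edits and deletions of labels that exist on the v1 object are reverted, while a label that exists only on the v3 object survives.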

I do believe that the "right" solution would be to change the UI to interact with a single object (cluster.provisioning.cattle.io v1) instead of changing the behavior of Rancher.

@richard-cox
Member

The handling of imported clusters in the new UI is done via an embedded instance of the old UI. The old ui exclusively uses the old norman API and has no concept of the new steve api.

@snasovich
Contributor

@felipe-colussi, thank you for the detailed explanation.
@richard-cox, as Felipe noted, the problem is inconsistency: creation is done via POST to v1/provisioning.cattle.io.clusters, establishing the provisioning cluster as the primary object, while editing is done via PUT to v3/clusters/cluster-id?_replace=true, modifying the management cluster instead.

We'll discuss further within the team to see what can/should be done on backend side to remedy this.

@snasovich
Contributor

@richard-cox @gaktive , we have further discussed it and since provisioning cattle object is the primary one for imported clusters (and is the one that is correctly created by UI) it should be the same object modified on Edit. Having 2 primary objects makes bi-directional sync unfeasible and will lead to sync conflicts.
Transferring back to dashboard.

@zube zube bot removed the [zube]: Working label Aug 17, 2023
@snasovich snasovich transferred this issue from rancher/rancher Aug 17, 2023
@gaktive gaktive added this to the v2.8.0 milestone Aug 17, 2023
@richard-cox
Member

ok, i'm late to the party. This will be resolved via #9476

@gaktive gaktive added team/area1 Team Neo and removed team/area2 Hostbusters labels Nov 21, 2023
@gaktive
Member Author

gaktive commented Dec 1, 2023

Backend should have unblocked this for UI and now we're blocking them. Moving to Next Up.

@gaktive
Member Author

gaktive commented May 31, 2024

For those folks who run into this, there is a workaround: update the labels on clusters.provisioning.cattle.io objects via kubectl.
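A rough sketch of that workaround, for anyone who wants to script it. It only builds a JSON merge patch and the matching `kubectl patch` argument list; it does not run anything. The helper names, the cluster name `my-imported-cluster`, and the namespace `fleet-default` are assumptions for illustration, so check where your clusters.provisioning.cattle.io object actually lives before using it.

```python
import json
import shlex


def build_label_patch(set_labels=None, remove_labels=None):
    """Build a JSON merge patch for metadata.labels.

    In a JSON merge patch, a null value removes the key.
    """
    labels = dict(set_labels or {})
    for key in remove_labels or []:
        labels[key] = None
    return {"metadata": {"labels": labels}}


def kubectl_patch_command(cluster_name, namespace, patch):
    """Return the kubectl argument list that applies the patch (hypothetical helper)."""
    return [
        "kubectl", "patch", "clusters.provisioning.cattle.io", cluster_name,
        "--namespace", namespace,
        "--type", "merge",
        "--patch", json.dumps(patch),
    ]


# Example from this issue: set managed to "no" and remove geo.
patch = build_label_patch(set_labels={"managed": "no"}, remove_labels=["geo"])
command = kubectl_patch_command("my-imported-cluster", "fleet-default", patch)
print(shlex.join(command))
```

The printed command can be reviewed and pasted into a shell, or the list can be passed to `subprocess.run` once the cluster name and namespace are confirmed.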

@gaktive gaktive modified the milestones: v2.9.next1, v2.10.0 May 31, 2024
@nwmac nwmac modified the milestones: v2.10.0, v2.11.0 Jul 4, 2024
@gaktive gaktive modified the milestones: v2.11.0, v2.10.0 Jul 4, 2024
@gaktive gaktive modified the milestones: v2.10.0, v2.11.0 Oct 2, 2024
@kkaempf kkaempf changed the title Cluster labels cannot be removed, updated or modified [SURE-6536] Cluster labels cannot be removed, updated or modified Oct 10, 2024
@mmartin24

FTR: our UI e2e tests for Fleet have seen an increase in failures due to this since 2.9 while removing labels, whereas it seemed to be relatively OK in 2.8.

@torchiaf
Member

Related issue: #11241

@kinarashah
Member

@eva-vashkevich FYI, backend PR for this issue has been merged.
