Make clusteradm accept idempotent #395

Open
nirs opened this issue Nov 22, 2023 · 0 comments
Labels
bug Something isn't working

nirs commented Nov 22, 2023

Describe the bug

Running clusteradm accept multiple times should succeed if the cluster is already accepted, but
it fails in some cases.

To Reproduce

  1. Build an OCM hub and 2 managed clusters using minikube VMs
  2. When both managed clusters are connected, stop one minikube VM
  3. Start the minikube VM and run clusteradm accept ... again
  4. clusteradm accept fails with:
      drenv.commands.Error: Command failed:
         command: ('clusteradm', 'accept', '--clusters', 'dr2', '--wait', '--context', 'hub')
         exitcode: 1
         error:
            Error: context deadline exceeded
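
Steps 2 to 4 can be sketched as a small script. This is only an illustration, not the drenv code: `reproduce_commands` and `reproduce` are hypothetical helpers, and the sketch assumes the minikube profile name matches the managed cluster name (`dr2`) and the hub kubeconfig context is `hub`.

```python
import subprocess


def reproduce_commands(cluster="dr2", hub="hub"):
    """Return the commands for steps 2-4 as argument lists."""
    return [
        # Step 2: stop one managed cluster VM
        # (assumption: minikube profile name == cluster name).
        ["minikube", "stop", "--profile", cluster],
        # Step 3: start the VM again ...
        ["minikube", "start", "--profile", cluster],
        # ... and run accept again, with the same arguments as the
        # failing command shown above.
        ["clusteradm", "accept", "--clusters", cluster, "--wait", "--context", hub],
    ]


def reproduce(run=subprocess.run):
    """Run the commands in order, stopping at the first failure."""
    for cmd in reproduce_commands():
        run(cmd, check=True)
```

With the behavior described in this issue, the last command is the one expected to fail.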

Running the command manually, we see that clusteradm is stuck in an endless loop:

Joining cluster 'hub'
Please log onto the hub cluster and run the following command:

    clusteradm accept --clusters dr2

Accepting cluster
no CSR to approve for cluster dr2
hubAcceptsClient already set for managed cluster dr2

 Your managed cluster dr2 has joined the Hub successfully. Visit https://open-cluster-management.io/scenarios or https://github.com/open-cluster-management-io/OCM/tree/main/solutions for next steps.
no CSR to approve for cluster dr2
hubAcceptsClient already set for managed cluster dr2

 Your managed cluster dr2 has joined the Hub successfully. Visit https://open-cluster-management.io/scenarios or https://github.com/open-cluster-management-io/OCM/tree/main/solutions for next steps.
no CSR to approve for cluster dr2
hubAcceptsClient already set for managed cluster dr2

...

Why run clusteradm again? We have automation that builds the minikube clusters, connects them with clusteradm, and installs many other components. The entire automation is idempotent, so any failure can be fixed by running it again on partly deployed clusters.

Expected behavior
If the managed cluster is already accepted, consider the operation successful.
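
One way to read this expectation, as a minimal sketch and not clusteradm's actual implementation (clusteradm is written in Go; `AcceptState` and `accept_outcome` are hypothetical names): treat "no CSR to approve" together with "hubAcceptsClient already set" as a final success state instead of a reason to keep polling.

```python
from dataclasses import dataclass


@dataclass
class AcceptState:
    """Hypothetical snapshot of what accept observes for one managed cluster."""

    pending_csrs: int         # CSRs still waiting for approval
    hub_accepts_client: bool  # value of hubAcceptsClient on the managed cluster


def accept_outcome(state: AcceptState) -> str:
    """Return the action an idempotent accept would take for this state."""
    if state.pending_csrs > 0 or not state.hub_accepts_client:
        # There is still work to do: approve CSRs and/or set hubAcceptsClient.
        return "accept"
    # Nothing to approve and the hub already accepts the client: the cluster
    # was accepted by an earlier run, so report success and stop.
    return "already-accepted"
```

In the log above, every iteration reports no CSR to approve and hubAcceptsClient already set, which in this sketch maps to `"already-accepted"`.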

Environment, i.e. OCM version, clusteradm version, Kubernetes version and provider:

$ clusteradm version
client		version	:v0.7.1
server release	version	:v1.27.4
default bundle	version	:0.12.0

$ clusteradm get hub-info --context hub
Registration Operator:
  Controller:	(1/1) quay.io/open-cluster-management/registration-operator:v0.12.0
  CustomResourceDefinition:
    (installed) clustermanagers.operator.open-cluster-management.io [*v1]
Components:
  Registration:
    Controller:	(1/1) quay.io/open-cluster-management/registration:v0.12.0
    Webhook:	(1/1) quay.io/open-cluster-management/registration:v0.12.0
  Work:
    Webhook:	(1/1) quay.io/open-cluster-management/work:v0.12.0
  Placement:
    Controller:	(1/1) quay.io/open-cluster-management/placement:v0.12.0
  CustomResourceDefinition:
    (installed) managedclustersetbindings.cluster.open-cluster-management.io [*v1beta2]
    (installed) placements.cluster.open-cluster-management.io [*v1beta1]
    (installed) clustermanagementaddons.addon.open-cluster-management.io [*v1alpha1]
    (installed) managedclusteraddons.addon.open-cluster-management.io [*v1alpha1]
    (installed) managedclusters.cluster.open-cluster-management.io [*v1]
    (installed) managedclustersets.cluster.open-cluster-management.io [*v1beta2]
    (installed) manifestworkreplicasets.work.open-cluster-management.io [*v1alpha1]
    (installed) manifestworks.work.open-cluster-management.io [*v1]
    (installed) placementdecisions.cluster.open-cluster-management.io [*v1beta1]
    (installed) addondeploymentconfigs.addon.open-cluster-management.io [*v1alpha1]
    (installed) addonplacementscores.cluster.open-cluster-management.io [*v1alpha1]
    (installed) addontemplates.addon.open-cluster-management.io [*v1alpha1]

Additional context

We can work around this by skipping the accept call if the managed cluster is already accepted:
RamenDR/ramen#1106
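
A rough sketch of such a workaround follows. It is not the code from the linked pull request: `is_accepted` and `accept_if_needed` are hypothetical helpers, and the sketch assumes `kubectl` is on the PATH and that `spec.hubAcceptsClient` on the ManagedCluster resource is a good enough indicator that the cluster was already accepted.

```python
import subprocess


def is_accepted(cluster, context, run=subprocess.run):
    """Return True if the hub already accepts the managed cluster."""
    cmd = [
        "kubectl", "get", "managedcluster", cluster,
        "--context", context,
        "--output", "jsonpath={.spec.hubAcceptsClient}",
        "--ignore-not-found",
    ]
    result = run(cmd, capture_output=True, text=True, check=True)
    # kubectl prints "true" when the field is set, and nothing when the
    # ManagedCluster resource does not exist yet.
    return result.stdout.strip() == "true"


def accept_if_needed(cluster, context, run=subprocess.run):
    """Run `clusteradm accept` only if the cluster is not accepted yet.

    Returns True if accept was run, False if it was skipped.
    """
    if is_accepted(cluster, context, run=run):
        return False
    run(
        ["clusteradm", "accept", "--clusters", cluster, "--wait", "--context", context],
        check=True,
    )
    return True
```

For example, `accept_if_needed("dr2", "hub")` would skip the accept call in the scenario described above. The `run` parameter exists only so the helpers can be exercised without a real cluster.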

nirs added the bug label Nov 22, 2023
nirs added a commit to nirs/ramen that referenced this issue Nov 23, 2023
To avoid idempotency issues in `clusteradm accept`[1] enable the
ManagedClusterAutoApproval feature gate, so `clusteradm accept` is not
needed.

Another way to solve this is to add `--skip-approve-check` option in
`clusteradm accept` but the approval step is not needed in context of a
testing environment.

[1] open-cluster-management-io/clusteradm#395

Thanks: Mike Ng <[email protected]>
Signed-off-by: Nir Soffer <[email protected]>
nirs added a commit to nirs/ramen that referenced this issue Nov 30, 2023
nirs added a commit to nirs/ramen that referenced this issue Nov 30, 2023
raghavendra-talur pushed a commit to RamenDR/ramen that referenced this issue Dec 8, 2023
ShyamsundarR pushed a commit to red-hat-storage/ramen that referenced this issue Dec 13, 2023