Cyclops operator creates orphan nodes when it is not able to completely drain the node #75

@skaushal-splunk

Describe the bug
An orphan node (one that is not attached to any autoscaling group) is left behind when draining the node fails for any reason.

To Reproduce

  1. Create a node that has pods in the Pending state. Pending pods are one reason node draining fails; there are others. The goal is a scenario in which draining fails.
  2. Create a CycleNodeRequest that creates a new node and detaches the old node.
  3. Create a CycleNodeStatus to drain the node.
  4. Run both CycleNodeRequest and CycleNodeStatus.

Current behavior
The CycleNodeRequest runs first: it detaches the old node and creates a new node. The CycleNodeStatus is then unable to drain the old node because it sees some pods in the Pending state.

As a result, the old node is no longer attached to the autoscaling group but still has pods running on it. The new node comes up but has only DaemonSet pods running on it.
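
The orphaned state described above can be checked for with a small script. The following is a minimal sketch, not part of Cyclops: the `Pod` and `Node` types, the `find_orphan_nodes` function, and the sample names and instance IDs are all hypothetical, and it assumes the node list and the set of instance IDs attached to autoscaling groups have already been fetched by some other means.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Pod:
    name: str
    owner_kind: str  # hypothetical field, e.g. "DaemonSet" or "ReplicaSet"


@dataclass
class Node:
    name: str
    instance_id: str
    pods: List[Pod] = field(default_factory=list)


def find_orphan_nodes(
    nodes: List[Node], attached_instance_ids: Set[str]
) -> List[Tuple[Node, List[Pod]]]:
    """Return each node whose instance is not attached to any autoscaling
    group, paired with the non-DaemonSet pods still running on it."""
    orphans = []
    for node in nodes:
        if node.instance_id in attached_instance_ids:
            continue  # still managed by an autoscaling group
        workload = [p for p in node.pods if p.owner_kind != "DaemonSet"]
        orphans.append((node, workload))
    return orphans


if __name__ == "__main__":
    # Illustrative data only: the old node was detached but never drained.
    old = Node("old-node", "i-old", [Pod("app-1", "ReplicaSet"), Pod("logs", "DaemonSet")])
    new = Node("new-node", "i-new", [Pod("logs", "DaemonSet")])
    for node, workload in find_orphan_nodes([old, new], {"i-new"}):
        print(node.name, [p.name for p in workload])
```

A node reported here with a non-empty workload list matches the situation in this issue: detached from its group, yet still running application pods.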

Expected behavior

  1. The old node should be drained, and the new node should run all the pods from the old node.
  2. No node should be left unattached to an autoscaling group.

Kubernetes Cluster Version
v1.25

Cyclops Version
v1.7.0

Metadata

Labels

bug: Something isn't working
