Aggregate NodeConfig status conditions #1918
Conversation
@rzetelskik: GitHub didn't allow me to request PR reviews from the following users: rzetelskik. Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Skipping CI for Draft Pull Request.
Force-pushed from 5269fcb to b7b4c9b
Force-pushed from 37352a5 to c0f8abe
/test all
Force-pushed from 7c30291 to f75cb9b
/test images e2e-gke-serial
Force-pushed from f75cb9b to a9d4381
/test all
@tnozicka you mention here #1557 (comment) that the tests should be extended to try out broken raid/mount configurations, but after thinking this through I don't think they should be part of the e2e suite - condition aggregation is covered by unit tests, and invalid configurations shouldn't get past validation. Any scenarios with corrupted devices require hacks on the host, are prone to flakiness and are difficult to even come up with - e.g. a scenario working on my local setup wouldn't reproduce in CI.
/cc zimnx
Force-pushed from a9d4381 to 2763581
I don't think validation has a chance to assess this. Say something is already mounted at the target path - it's not something to be assessed on the API level.
I think it should be possible to come up with at least one without node changes, like trying to mount something over
/hold cancel
Force-pushed from 2132ccf to 77c0be6
rebased
@rzetelskik: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
known manager flake
Force-pushed from 77c0be6 to 71a20de
…st for degraded condition propagation
Force-pushed from 71a20de to 5a9d52c
cond := scyllav1alpha1.NodeConfigCondition{
	Type:               scyllav1alpha1.NodeConfigReconciledConditionType,
	ObservedGeneration: nc.Generation,
}
conditions are not technically part of the API version / can be unknown or unset - but I agree, it should be kept, at least for a while, to retain compatibility and behaviour
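As a side note on that compatibility point, here is a minimal sketch of how the deprecated Reconciled condition could be computed from the new aggregated workload conditions. The function name, the condition type strings and the use of metav1.Condition as the result type are assumptions made for illustration; only the k8s.io/apimachinery helpers are real, and this is not the PR's actual code.

// Illustrative sketch only: derive a deprecated Reconciled-style condition
// from the aggregated workload conditions. Everything except the
// k8s.io/apimachinery helpers is an assumption made for this example.
package example

import (
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// deriveReconciledCondition reports the object as reconciled when it is
// Available and no longer Progressing.
func deriveReconciledCondition(conditions []metav1.Condition, generation int64) metav1.Condition {
	available := apimeta.IsStatusConditionTrue(conditions, "Available")
	progressing := apimeta.IsStatusConditionTrue(conditions, "Progressing")

	cond := metav1.Condition{
		Type:               "Reconciled",
		ObservedGeneration: generation,
		Status:             metav1.ConditionFalse,
		Reason:             "NotFullyReconciled",
		Message:            "Operands are not fully reconciled and available yet.",
	}
	if available && !progressing {
		cond.Status = metav1.ConditionTrue
		cond.Reason = "FullyReconciledAndUp"
		cond.Message = "All operands are reconciled and available."
	}
	return cond
}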
cond := scyllav1alpha1.NodeConfigCondition{
	Type:               scyllav1alpha1.NodeConfigReconciledConditionType,
	ObservedGeneration: nc.Generation,
}
I'd put this into syncDaemonSet though, something like:
// Sum the desired pod counts and track whether every DaemonSet is fully rolled out.
desiredSum := int64(0)
allReconciled := true
for _, requiredDaemonSet := range requiredDaemonSets {
	if requiredDaemonSet == nil {
		continue
	}

	ds, _, err := resourceapply.ApplyDaemonSet(ctx, ncc.kubeClient.AppsV1(), ncc.daemonSetLister, ncc.eventRecorder, requiredDaemonSet, resourceapply.ApplyOptions{})
	if err != nil {
		return progressingConditions, fmt.Errorf("can't apply daemonset: %w", err)
	}

	desiredSum += int64(ds.Status.DesiredNumberScheduled)

	reconciled, err := controllerhelpers.IsDaemonSetRolledOut(ds)
	if err != nil {
		return progressingConditions, fmt.Errorf("can't determine if daemonset %q is reconciled: %w", naming.ObjRef(ds), err)
	}
	if !reconciled {
		allReconciled = false
	}
}

status.DesiredNodeSetupCount = pointer.Ptr(desiredSum)

// Project the rollout state into the deprecated Reconciled condition.
reconciledCondition := metav1.Condition{
	Type:               string(scyllav1alpha1.NodeConfigReconciledConditionType),
	ObservedGeneration: nc.Generation,
	Status:             metav1.ConditionUnknown,
}
if allReconciled {
	reconciledCondition.Status = metav1.ConditionTrue
	reconciledCondition.Reason = "FullyReconciledAndUp"
	reconciledCondition.Message = "All operands are reconciled and available."
} else {
	reconciledCondition.Status = metav1.ConditionFalse
	reconciledCondition.Reason = "DaemonSetNotRolledOut"
	reconciledCondition.Message = "DaemonSet isn't reconciled and fully rolled out yet."
}
_ = apimeta.SetStatusCondition(statusConditions, reconciledCondition)
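For reference, a rollout check like controllerhelpers.IsDaemonSetRolledOut usually boils down to comparing the DaemonSet's observed generation and its scheduled/available pod counts. The sketch below is an assumption based on standard appsv1.DaemonSetStatus semantics, not the actual scylla-operator helper (which, as used above, also returns an error).

// Rough sketch of a DaemonSet rollout check, based only on standard
// appsv1.DaemonSetStatus fields; not the actual
// controllerhelpers.IsDaemonSetRolledOut implementation.
package example

import (
	appsv1 "k8s.io/api/apps/v1"
)

// isDaemonSetRolledOut returns true once the DaemonSet controller has observed
// the latest spec and all desired pods run the updated template and are available.
func isDaemonSetRolledOut(ds *appsv1.DaemonSet) bool {
	if ds.Status.ObservedGeneration < ds.Generation {
		// The controller hasn't processed the latest spec yet.
		return false
	}
	return ds.Status.UpdatedNumberScheduled == ds.Status.DesiredNumberScheduled &&
		ds.Status.NumberAvailable == ds.Status.DesiredNumberScheduled
}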
@zimnx ping
lgtm
/assign tnozicka
/approve
/lgtm
cond := scyllav1alpha1.NodeConfigCondition{
	Type:               scyllav1alpha1.NodeConfigReconciledConditionType,
	ObservedGeneration: nc.Generation,
}
I hadn't seen the new projection at the time I wrote this - looking at it, it changes the semantics, but I guess that's ok
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: rzetelskik, tnozicka, zimnx. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Description of your changes:
Currently, the node setup controller only sets up conditions in the form of Node%sAvailable and the like. They are not practical and can't be easily used to query the NodeConfig status. This PR makes it so that NodeConfig conditions are aggregated into the generic Available, Progressing and Degraded, by:
- Replacing the proprietary NodeConfigCondition with metav1.Condition in the NodeConfig API, which also makes the NodeConfigCondition helpers unnecessary, so any related code is removed. Edit: we can't do this in this API version due to metav1.Condition having tighter validation. Left TODOs about making this change in the next API version.
- Deprecating NodeConfigReconciledConditionType. The standard workload conditions are now also used to calculate the deprecated condition.

Which issue is resolved by this Pull Request:
Resolves #1557
/kind feature
/priority important-soon
/cc
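For illustration, a simplified sketch of the aggregation pattern described above: fine-grained conditions whose types end in Available, Progressing or Degraded are folded into the three generic conditions. The function and the reason/message strings are assumptions, Unknown statuses are ignored for brevity, and only the apimachinery helpers are real; this is not the operator's actual aggregation code.

// Simplified sketch of folding fine-grained conditions into the generic
// Available/Progressing/Degraded ones; an assumption for illustration, not
// the operator's actual aggregation helper.
package example

import (
	"strings"

	apimeta "k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// aggregateWorkloadConditions computes the generic conditions from the
// fine-grained ones (e.g. "RaidControllerNodeXAvailable") and stores them
// back into the same slice.
func aggregateWorkloadConditions(conditions *[]metav1.Condition, generation int64) {
	available := metav1.ConditionTrue    // true only if every *Available sub-condition is true
	progressing := metav1.ConditionFalse // true if any *Progressing sub-condition is true
	degraded := metav1.ConditionFalse    // true if any *Degraded sub-condition is true

	for _, c := range *conditions {
		switch {
		case c.Type != "Available" && strings.HasSuffix(c.Type, "Available"):
			if c.Status != metav1.ConditionTrue {
				available = metav1.ConditionFalse
			}
		case c.Type != "Progressing" && strings.HasSuffix(c.Type, "Progressing"):
			if c.Status == metav1.ConditionTrue {
				progressing = metav1.ConditionTrue
			}
		case c.Type != "Degraded" && strings.HasSuffix(c.Type, "Degraded"):
			if c.Status == metav1.ConditionTrue {
				degraded = metav1.ConditionTrue
			}
		}
	}

	for t, s := range map[string]metav1.ConditionStatus{
		"Available":   available,
		"Progressing": progressing,
		"Degraded":    degraded,
	} {
		apimeta.SetStatusCondition(conditions, metav1.Condition{
			Type:               t,
			Status:             s,
			ObservedGeneration: generation,
			Reason:             "AggregatedFromSubConditions",
			Message:            "Aggregated from the fine-grained workload conditions.",
		})
	}
}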