Add topologySpreadConstraints configuration to pod spec. #2530
b948c94
to
44fcb1f
Compare
We need that feature too.
@@ -465,6 +465,11 @@ func (c *Cluster) compareStatefulSetWith(statefulSet *appsv1.StatefulSet) *compa
		needsRollUpdate = true
		reasons = append(reasons, "new statefulset's pod affinity does not match the current one")
	}
	if !reflect.DeepEqual(c.Statefulset.Spec.Template.Spec.TopologySpreadConstraints, statefulSet.Spec.Template.Spec.TopologySpreadConstraints) {
		needsReplace = true
		needsRollUpdate = true
Does this really need to trigger a rolling update of pods executed by the operator? Won't K8s take care of it once the statefulset is replaced?
I see, but is this wrong too?
https://github.com/zalando/postgres-operator/blob/master/pkg/cluster/cluster.go#L472
Hm, good point. Maybe we can leave it as is for now. With a rolling update we make sure pods immediately adhere to the new constraints.
Can you also write an e2e test that tests that the constraints work as expected, please?
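The comparison discussed in this thread can be sketched in isolation. This is a minimal illustration, not the operator's code: spreadConstraint is a hypothetical stand-in for v1.TopologySpreadConstraint, compareResult and compareSpread are assumed names, and the reason string is invented for the example. It shows one detail worth keeping in mind with reflect.DeepEqual: a nil slice and an empty non-nil slice are not deeply equal, so a manifest with an empty list could be seen as a change.

```go
package main

import (
	"fmt"
	"reflect"
)

// spreadConstraint is a hypothetical stand-in for v1.TopologySpreadConstraint,
// reduced to three fields so the example needs no Kubernetes imports.
type spreadConstraint struct {
	MaxSkew           int32
	TopologyKey       string
	WhenUnsatisfiable string
}

// compareResult mirrors the flags set in the diff above (type name assumed).
type compareResult struct {
	needsReplace    bool
	needsRollUpdate bool
	reasons         []string
}

// compareSpread flags a statefulset replace and a rolling update
// whenever the current and desired constraints differ.
func compareSpread(current, desired []spreadConstraint) compareResult {
	var res compareResult
	if !reflect.DeepEqual(current, desired) {
		res.needsReplace = true
		res.needsRollUpdate = true
		// Reason text is an assumption made for this sketch.
		res.reasons = append(res.reasons, "new statefulset's topology spread constraints do not match the current ones")
	}
	return res
}

func main() {
	zone := []spreadConstraint{{
		MaxSkew:           1,
		TopologyKey:       "topology.kubernetes.io/zone",
		WhenUnsatisfiable: "DoNotSchedule",
	}}

	fmt.Println(compareSpread(zone, zone).needsRollUpdate) // false
	fmt.Println(compareSpread(nil, zone).needsRollUpdate)  // true
	// A nil slice and an empty non-nil slice are not deeply equal.
	fmt.Println(compareSpread(nil, []spreadConstraint{}).needsRollUpdate) // true
}
```

If the nil-versus-empty case matters in practice, one option would be to normalize both sides before comparing.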
530f847
to
18023cb
Compare
434c6c5
to
ee2b43d
Compare
Dear @FxKu
ee2b43d
to
24b9f65
Compare
Dear @FxKu
@laiminhtrung1997 thanks a lot for the update. I think, in this state you can be sure we will merge it for the next release. We have to focus on the new status feature first but I will get back to you in September.
256fd9f
to
fbac974
Compare
pkg/util/config/config.go
Outdated
@@ -254,6 +254,7 @@ type Config struct {
	EnableSecretsDeletion *bool `name:"enable_secrets_deletion" default:"true"`
	EnablePersistentVolumeClaimDeletion *bool `name:"enable_persistent_volume_claim_deletion" default:"true"`
	PersistentVolumeClaimRetentionPolicy map[string]string `name:"persistent_volume_claim_retention_policy" default:"when_deleted:retain,when_scaled:retain"`
	EnablePostgresTopologySpreadConstraints bool `json:"enable_postgres_topology_spread_constraints,omitempty"`
can be removed. See comment above.
pkg/cluster/k8sres.go
Outdated
@@ -599,6 +599,22 @@ func generatePodAntiAffinity(podAffinityTerm v1.PodAffinityTerm, preferredDuring
	return podAntiAffinity
}

func generateTopologySpreadConstraints(labels labels.Set, additionalTopologySpreadConstraints []v1.TopologySpreadConstraint) []v1.TopologySpreadConstraint {
Help me to understand what this function is doing:
- Do we have to define this first hard-coded TopologySpreadConstraint when somebody specifies constraints in the manifest?
- What would happen if it is missing?
- Should the operator always create this spread constraint, similar to the node affinities?
The purpose of this function is:
- If topologySpreadConstraints is set to true and additionalTopologySpreadConstraints is either empty or undefined, the operator will apply default constraints as hardcoded.
- If additionalTopologySpreadConstraints is defined, the specified list of constraints will be appended.
- The topologySpreadConstraints setting is configured to make the constraints customizable.
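The behavior described in the list above could look roughly like the sketch below. It is a reconstruction from the description only, not the code in this PR: spreadConstraint is a hypothetical stand-in for v1.TopologySpreadConstraint, generateSpreadSketch is an assumed name, and the contents of the hard-coded default (maxSkew 1, zone topology key, DoNotSchedule) are assumptions, since the actual default is not shown in this thread. The enable flag is assumed to be checked by the caller.

```go
package main

import "fmt"

// spreadConstraint is a hypothetical stand-in for v1.TopologySpreadConstraint.
type spreadConstraint struct {
	MaxSkew           int32
	TopologyKey       string
	WhenUnsatisfiable string
	MatchLabels       map[string]string
}

// generateSpreadSketch returns a hard-coded default constraint followed by
// any additional constraints from the manifest.
// The default values below are assumptions made for this sketch.
func generateSpreadSketch(labels map[string]string, additional []spreadConstraint) []spreadConstraint {
	defaultConstraint := spreadConstraint{
		MaxSkew:           1,
		TopologyKey:       "topology.kubernetes.io/zone",
		WhenUnsatisfiable: "DoNotSchedule",
		MatchLabels:       labels,
	}
	result := []spreadConstraint{defaultConstraint}
	// Appending a nil or empty slice leaves only the default in place.
	return append(result, additional...)
}

func main() {
	labels := map[string]string{"cluster-name": "demo"}

	onlyDefault := generateSpreadSketch(labels, nil)
	fmt.Println(len(onlyDefault)) // 1

	extra := []spreadConstraint{{
		MaxSkew:           1,
		TopologyKey:       "kubernetes.io/hostname",
		WhenUnsatisfiable: "ScheduleAnyway",
		MatchLabels:       labels,
	}}
	withExtra := generateSpreadSketch(labels, extra)
	fmt.Println(len(withExtra), withExtra[1].TopologyKey) // 2 kubernetes.io/hostname
}
```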
@FxKu
Our logic might be different. Please let me know how you’d like the operator to apply the constraint, and I’ll implement it according to your suggestion. I'd love to do it.
@laiminhtrung1997 you have only described what the code does. I can read and understand code myself 😃
Please try to answer my questions. I'm wondering if the function is needed at all? Why not go with what people specify in the manifest? I should have made this thought more clear. Hope you will understand my questions better now.
Okay, I got it. So, the only configuration input by users is topologySpreadConstraints, which will be generated in the manifest.
I will refactor it right away. Thank you.
nullable: true
items:
  type: object
  x-kubernetes-preserve-unknown-fields: true
Can we not add all the fields of a topologySpreadConstraint, like what we have for nodeAffinity? I feel it's too lazy and unsafe to allow arbitrary fields with x-kubernetes-preserve-unknown-fields: true.
			XPreserveUnknownFields: util.True(),
		},
	},
},
Same here with the XPreserveUnknownFields. Yes, there are other fields where we are doing it like this, but let's get it right for new additions. I know it's tedious to reflect the full schema because we don't use a framework like Kubebuilder. But it should be the trade-off for contributors when they go the "easy way" of allowing full specs in our manifest over custom stripped-down designs better suited for end users.
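To illustrate what an explicit schema would catch that x-kubernetes-preserve-unknown-fields lets through, here is a small validator sketch. It is not part of the operator: spreadConstraint and validateSpreadConstraint are hypothetical names, and only three fields are covered. The rules themselves follow the Kubernetes API for topology spread constraints: maxSkew must be greater than zero, topologyKey is required, and whenUnsatisfiable is either DoNotSchedule or ScheduleAnyway.

```go
package main

import (
	"errors"
	"fmt"
)

// spreadConstraint is a hypothetical stand-in for v1.TopologySpreadConstraint,
// limited to the fields checked below.
type spreadConstraint struct {
	MaxSkew           int32
	TopologyKey       string
	WhenUnsatisfiable string
}

// validateSpreadConstraint applies the kind of checks a fully specified
// CRD schema would enforce at admission time.
func validateSpreadConstraint(c spreadConstraint) error {
	if c.MaxSkew < 1 {
		return errors.New("maxSkew must be at least 1")
	}
	if c.TopologyKey == "" {
		return errors.New("topologyKey is required")
	}
	switch c.WhenUnsatisfiable {
	case "DoNotSchedule", "ScheduleAnyway":
		return nil
	default:
		return fmt.Errorf("whenUnsatisfiable %q is not one of DoNotSchedule, ScheduleAnyway", c.WhenUnsatisfiable)
	}
}

func main() {
	valid := spreadConstraint{MaxSkew: 1, TopologyKey: "topology.kubernetes.io/zone", WhenUnsatisfiable: "DoNotSchedule"}
	fmt.Println(validateSpreadConstraint(valid)) // <nil>

	// A typo in the enum value passes an open schema but fails here.
	typo := spreadConstraint{MaxSkew: 1, TopologyKey: "topology.kubernetes.io/zone", WhenUnsatisfiable: "DoNotSchedul"}
	fmt.Println(validateSpreadConstraint(typo) != nil) // true
}
```

With an open schema, mistakes like the typo above would only surface later, when the statefulset is created or updated.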
@laiminhtrung1997 have you tested how topology spread constraints behave together with a nodeAffinity specified in the manifest and globally configured pod anti-affinity rules? How easy is it to create a scenario where they contradict each other and lead to scheduling problems? Should one be used over the other? Maybe @monotek can answer this, too?
5f916be
to
3e99f92
Compare
3e99f92
to
794f6db
Compare
Advice for the future: Don't force push and squash your commits in the middle of a review. Now it's super hard for me to see what feedback you've reflected and I have to review everything again 😞
Noted. I am truly sorry for this. It will not happen again.
Currently, I have configured the operator using topologySpreadConstraints and Affinity.PodAntiAffinity together. My expectation is that the pods are always scheduled on different nodes and in different availability zones. This is my manifest for them.
Do you want me to put that scenario in the e2e test?
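For reasoning about such scenarios, the skew rule behind a DoNotSchedule constraint can be sketched as follows. This is a simplified model, not scheduler code: canPlace is a hypothetical helper, the map is assumed to list every eligible topology domain (including those with zero matching pods), and options such as minDomains, node affinity policies and pod anti-affinity are ignored. A pod may go into a domain only if that domain's matching-pod count plus one, minus the lowest count across all domains, does not exceed maxSkew.

```go
package main

import "fmt"

// canPlace reports whether one more matching pod may be scheduled into
// domain without exceeding maxSkew. podsPerDomain must contain every
// eligible domain, including domains that currently hold zero pods.
func canPlace(podsPerDomain map[string]int, domain string, maxSkew int) bool {
	count, ok := podsPerDomain[domain]
	if !ok {
		return false // unknown domain: nothing to compare against
	}
	// Find the lowest number of matching pods in any domain.
	lowest := count
	for _, n := range podsPerDomain {
		if n < lowest {
			lowest = n
		}
	}
	// Skew after placement: pods in the target domain (plus the new one)
	// minus the global minimum.
	return count+1-lowest <= maxSkew
}

func main() {
	zones := map[string]int{"zone-a": 1, "zone-b": 1, "zone-c": 0}

	fmt.Println(canPlace(zones, "zone-a", 1)) // false: skew would be 2
	fmt.Println(canPlace(zones, "zone-c", 1)) // true: skew would be 1
}
```

In this model, a scheduling problem appears when the only domain that satisfies the skew rule is excluded by another rule, for example a required pod anti-affinity or a nodeAffinity that restricts the pod to different zones.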
Dear all,
I think we should add topologySpreadConstraints to the pod spec so that pods can be spread across zones for high availability.
Could someone review it, please? Thank you very much.
Best regards.