You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In #119, a defaulting of missing replicas: field to 1 was added. This behaviour could impact reliability of a service. Defaulting was added to workaround HPA's deficiency (kubernetes/kubernetes#111781)
We are using a custom autoscaler, not HPA. Our deployments tend not to have replicas: field defined. Our expectations is that during new version rollout, the number of replicas is carried-over from previous version.
What we observed, when replicas: field is not set, argo-rollouts sets it to 1. After few seconds, our autoscaler sets replicas: back to the proper value. Unfortunately during those few seconds Kubernetes starts killing the pods, and until new pods are spun up and ready, the service is degraded.
To Reproduce
Prepare a deployment without a spec.replicas: field. Observe "Defaulting .spec.replica to 1" message in argo rollout's log.
Expected behavior
The number of replicas, if not defined, should be carried over from current state. The code should do something like
// In order to work with HPA, the rollout.Spec.Replica field cannot be nil. As a result, the controller will update
// the rollout to have the replicas field set to the current or the default value. see https://github.com/argoproj/argo-rollouts/issues/119
if rollout.Spec.Replicas == nil {
if rollout.Status.Replicas != nil {
replicas := rollout.Status.Replicas
} else {
replicas := pointer.Int32Ptr(defaults.DefaultReplicas)
}
logCtx.Info("Defaulting .spec.replicas to %d", replicas)
r.Spec.Replicas = replicas
(Above isn't syntactically correct, sorry).
rollout.status.replicas: is NOT marked as an optional field, so should be available all the time.
The value of 0 (zero) of the number of status replicas may be wanted/correct and should be carried over, too.
Screenshots
Version
1.7.0
Logs
# Paste the logs from the rollout controller
# Logs for the entire controller:
kubectl logs -n argo-rollouts deployment/argo-rollouts
# Logs for a specific rollout:
kubectl logs -n argo-rollouts deployment/argo-rollouts | grep rollout=<ROLLOUTNAME
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
The text was updated successfully, but these errors were encountered:
Yeah this is a serious issue that affects me as well. I use an HPA and if I don't set the replicas field to the current replica count, it wipes out all but one of my pods before recreating them. I have persistent client connections I can't afford to lose in this way.
Checklist:
Describe the bug
In #119, a defaulting of missing
replicas:
field to 1 was added. This behaviour could impact reliability of a service. Defaulting was added to workaround HPA's deficiency (kubernetes/kubernetes#111781)We are using a custom autoscaler, not HPA. Our deployments tend not to have
replicas:
field defined. Our expectations is that during new version rollout, the number of replicas is carried-over from previous version.What we observed, when
replicas:
field is not set, argo-rollouts sets it to 1. After few seconds, our autoscaler setsreplicas:
back to the proper value. Unfortunately during those few seconds Kubernetes starts killing the pods, and until new pods are spun up and ready, the service is degraded.To Reproduce
Prepare a deployment without a
spec.replicas:
field. Observe"Defaulting .spec.replica to 1"
message in argo rollout's log.Expected behavior
The number of replicas, if not defined, should be carried over from current state. The code should do something like
(Above isn't syntactically correct, sorry).
rollout.status.replicas:
is NOT marked as an optional field, so should be available all the time.The value of
0
(zero) of the number of status replicas may be wanted/correct and should be carried over, too.Screenshots
Version
1.7.0
Logs
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
The text was updated successfully, but these errors were encountered: