This repository has been archived by the owner on Dec 24, 2019. It is now read-only.
In my mind it is not always necessary to have buffering enabled for the whole cluster. Furthermore, a buffer is no guarantee of a smooth scale-up when it is spread evenly across a large number of nodes, so that no single node has enough free resources to accept a pod.
An alternative would be to mark pods (or deployments) with an annotation indicating that the application requires fast scaling. In response to the annotation, cluster-autoscaler would maintain a set of ghost pods (placeholders) with the same resource requests, which can be killed at any moment so a real pod can be scheduled in their place. When cluster-autoscaler detects an unschedulable pod, it checks whether that pod has ghost pods and kills one of them. Once the real pod is scheduled, the ghost pod is recreated; if it cannot be, an ASG scale-up is triggered. This provides a fast response and a guarantee that the pod will find a place to be scheduled. It also gives control at the application level: not every application requires fast scale-up, and some can tolerate dropped requests. Finally, it becomes easy to prepare for an expected load increase by changing the annotation, or to automate scaling up/down by time of day or day of the week by adjusting the number of ghost pods.
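To make the proposed loop concrete, here is a minimal sketch of the reconcile step, assuming pods are modelled as plain dicts with CPU/memory requests. The function names and data shapes are hypothetical, not the real cluster-autoscaler API:

```python
# Hypothetical sketch of the ghost-pod reconcile step described above.
# For each unschedulable pod, evict a ghost pod whose resource request
# covers it; if no ghost fits, fall back to triggering an ASG scale-up.

def reconcile(pending_pods, ghost_pods):
    """Return (ghosts to evict, pods that need a node scale-up)."""
    evictions, scale_ups = [], []
    free_ghosts = list(ghost_pods)
    for pod in pending_pods:
        match = next(
            (g for g in free_ghosts
             if g["cpu"] >= pod["cpu"] and g["memory"] >= pod["memory"]),
            None,
        )
        if match is not None:
            free_ghosts.remove(match)
            evictions.append(match["name"])   # real pod takes the freed slot
        else:
            scale_ups.append(pod["name"])     # no placeholder left -> new node
    return evictions, scale_ups
```

Recreating the evicted ghost after the real pod lands would then restore the buffer for the next burst.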
Guaranteeing that the right pod gets the ghost pod's slot can be achieved through taints on the node, preventing other pods from taking its place.
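The taint-based guard could look like the following simplified check, where taints and tolerations are modelled as `(key, value, effect)` tuples and only the `Equal` operator is considered (an assumption, not full Kubernetes semantics):

```python
# Simplified sketch of the taint check that keeps the reserved slot safe:
# only pods carrying a toleration for the node's "ghost-slot" taint may
# be scheduled there. Tuples are (key, value, effect).

def tolerates(node_taints, pod_tolerations):
    """A pod fits only if it tolerates every NoSchedule taint on the node."""
    return all(
        taint in pod_tolerations
        for taint in node_taints
        if taint[2] == "NoSchedule"
    )
```

The node holding the ghost pods would carry a taint like `("ghost-slot", "web", "NoSchedule")`, and only the fast-scaling app's pods would carry the matching toleration.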
Thoughts?
I like the idea of "ghost pods" to ensure quick autoscaling (or deployment) of certain apps. You are right: the current percentage-based buffer does not guarantee that a slot for a critical app is actually available.