-
I like this idea, but I think this will not be possible since we rely on HPA for this. But I'm sure @zroubalik will be able to tell us.
-
The current scaling method is hard-coded as a linear scaling of ceiling(observed_metric / target_metric).
Example:
observed_metric = 100 backlog/instance
target_metric = 10 backlog/instance
ceiling(100 / 10) = ceiling(10) = 10 = new scale target (capped by the configured min/max instance count)

An interesting long-term direction would be to allow this scaling algorithm to be configured with alternate implementations (not necessarily user-provided implementations, but at least a selection of provided ones). This would allow for something like the hill-climbing algorithm discussed here: kedacore/http-add-on#142 (reply in thread)
This would allow for experimentation with different algorithms that can have better behaviors than the current linear scaling method.
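As a minimal sketch of the linear method described above (the function name and parameters are illustrative, not KEDA's actual API):

```python
import math

def linear_scale(observed_metric: float, target_metric: float,
                 min_replicas: int, max_replicas: int) -> int:
    """Linear scaling: ceiling(observed / target), clamped to the
    configured min/max replica counts."""
    desired = math.ceil(observed_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# Example from above: 100 backlog with a target of 10 per instance
print(linear_scale(100, 10, min_replicas=1, max_replicas=20))  # -> 10
```

Note that the clamp is what produces the "capped by min/max" behavior: with `max_replicas=5` the same inputs would yield 5.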
For example, consider a message-processing workload that is scaled on queue backlog and processes messages by persisting them to a backend datastore. As the backlog grows, more workload instances are provisioned, on the expectation that more instances means higher throughput. However, if the backlog is actually caused by slow processing because the backend datastore cannot keep up with demand (disk failure, low resources, networking, anything), then adding more workload instances may put even more pressure on the datastore and lower overall throughput, making the situation worse.
A better option would be a scaling implementation like a hill-climbing algorithm that monitors how changes in the number of workload instances impact a target metric (this would likely be an overall metric like total throughput, not a per-instance metric as currently implemented) and probes whether increases actually help meet that target or make it worse.
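One step of such a hill-climbing scaler could be sketched as follows. This is a simplified illustration under assumed inputs (the function name, the single-step probe, and the throughput-comparison rule are all hypothetical, not an existing KEDA implementation):

```python
def hill_climb_step(current_replicas: int,
                    throughput_now: float, throughput_prev: float,
                    last_direction: int,
                    min_replicas: int, max_replicas: int) -> tuple[int, int]:
    """One probing step toward higher total throughput.

    If the last replica change improved total throughput, keep probing
    in the same direction; otherwise reverse. Returns the new desired
    replica count and the direction taken (+1 or -1).
    """
    if throughput_now >= throughput_prev:
        direction = last_direction
    else:
        direction = -last_direction
    desired = current_replicas + direction
    desired = max(min_replicas, min(max_replicas, desired))
    return desired, direction
```

Unlike the linear formula, this keyed-on-throughput probe would stop scaling out when added instances stop helping, e.g. when the backend datastore is the bottleneck.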
This would prevent the current linear scaling implementation's pathological case where (assuming the above message-processing workload) the backend datastore is unavailable for a sustained period, causing a backlog large enough to scale workers up to the configured maximum number of instances (costing $$$) while making zero progress on processing any of the messages.
The example hill-climbing algorithm may not be the best fit for all use cases, but a configurable scaling algorithm would allow both selection among provided implementations and relatively low-risk experimentation with new ones that address scenarios where the existing scaling implementation performs poorly.