We observed increased latency on clusters with many coredns replicas (100). Right now, coredns uses an HPA to scale the number of replicas, but we could investigate whether using a VPA would mitigate this issue.
In that cluster, what I see is that we are maxing out the HPA due to memory consumption.
Memory grows with cluster size and load, but the CPU metric shows that each coredns pod is not heavily loaded. When load is low, cluster size accounts for most of the memory consumption, causing the HPA to scale up unnecessarily.
We could combine an HPA (CPU only) and a VPA (memory only); that way we scale horizontally based on load and allocate more memory per pod depending on cluster size.
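A rough sketch of what that split could look like, assuming the coredns Deployment is named `coredns` in `kube-system` and the VPA CRDs are installed; the names, thresholds, and min/max bounds are placeholders, not the actual app defaults:

```yaml
# HPA scales replicas on CPU only, so query load (not cluster size) drives horizontal scaling.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coredns              # placeholder name
  namespace: kube-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coredns
  minReplicas: 2             # placeholder bounds
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # placeholder threshold
---
# VPA adjusts only the memory request/limit, so cluster size grows each pod vertically
# instead of forcing the HPA to add replicas.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: coredns              # placeholder name
  namespace: kube-system
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coredns
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: coredns
        controlledResources: ["memory"]   # leave CPU entirely to the HPA
        minAllowed:
          memory: 100Mi                   # placeholder bounds
        maxAllowed:
          memory: 1Gi
```

Keeping the two autoscalers on disjoint resources (HPA on CPU, VPA on memory) avoids them acting on the same metric and fighting each other.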
This is possible, but we need to figure out how to properly test it.
The other issue to solve is that we can't ship the VPA CR in the coredns app.
I'll try using https://github.com/coredns/perf-tests/blob/master/kubernetes. Basically, what it does is create a bunch of pods, headless services, and regular services to load coredns the way we need in our case. If that works, we can implement it as a test in our CI.
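For reference, a minimal example of the kind of object such a test creates many copies of (the names and namespace are illustrative, not what the perf-tests scripts actually generate); each headless Service adds endpoint records that coredns has to hold and serve, which is how the test reproduces the cluster-size effect on memory:

```yaml
# One of N generated headless Services; the test creates many of these (plus backing pods)
# so that coredns has to track and answer for a large number of endpoint records.
apiVersion: v1
kind: Service
metadata:
  name: perf-headless-001    # illustrative name
  namespace: dns-perf-test   # illustrative namespace
spec:
  clusterIP: None            # headless: DNS resolves directly to the pod IPs
  selector:
    app: perf-backend-001
  ports:
    - port: 80
```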