Context and background for this VPA Tracker are available in this Elotl blog post: VPA Tracker
In this README we provide detailed steps on how the VPA Tracker stack is set up and used to track a sample deployment workload.
We will use an EKS cluster for this example:
```
eksctl create cluster -f sel-vpa-ekscluster-luna.yaml
```

Rename the cluster context for ease of use:

```
kubectl config rename-context [email protected] sel-vpa
```

Install the Luna autoscaler:

```
% cd luna/luna-v1.2.18/eks
% ./deploy.sh --name sel-vpa --region us-west-1 --additional-helm-values "--set loggingVerbosity=5 --set telemetry=false --debug"
```

We use the open-source kube-prometheus-stack project to install Prometheus and Grafana:
```
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack
```

Here is a sample of the success message after installation:
```
% helm install prometheus prometheus-community/kube-prometheus-stack
NAME: prometheus
LAST DEPLOYED: Tue Jun 10 14:28:00 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace default get pods -l "release=prometheus"
Get Grafana 'admin' user password by running:
kubectl --namespace default get secrets prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
Access Grafana local instance:
export POD_NAME=$(kubectl --namespace default get pod -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=prometheus" -oname)
kubectl --namespace default port-forward $POD_NAME 3000
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
```

We install VPA using the instructions from here: VPA Installation
```
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
```

Switch to the target cluster context and run the install script:

```
% kubectl config use-context sel-vpa
Switched to context "sel-vpa".
% ./hack/vpa-up.sh
```

Sample output from a successful install:

```
...
Generating certs for the VPA Admission Controller in /tmp/vpa-certs.
Certificate request self-signature ok
subject=CN = vpa-webhook.kube-system.svc
Uploading certs to the cluster.
secret/vpa-tls-certs created
Deleting /tmp/vpa-certs.
service/vpa-webhook created
deployment.apps/vpa-admission-controller created
```

These are the pods of the VPA:
```
% kubectl get pods -A
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE
...SNIP...
kube-system   vpa-admission-controller-5d79d9f956-qsqhq   1/1     Running   0          2m48s
kube-system   vpa-recommender-544df95d65-n4qjv            1/1     Running   0          2m50s
kube-system   vpa-updater-54ddf66b6d-smlnq                1/1     Running   0          2m51s
```

By default, the VPA recommender exposes metrics on port 8942.
We need to ensure that there is a Kubernetes Service for the recommender pod that maps to this port.
```
kubectl apply -f vpa-tracker/vpa-metrics-expose-svc.yaml
service/vpa-recommender created
kubectl apply -f vpa-tracker/vpa-recommender-servicemonitor.yaml
```
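For reference, a minimal sketch of what these two manifests might contain is shown below; the pod selector label, port name, and ServiceMonitor release label are assumptions based on the upstream VPA manifests, and the files in this repo are authoritative:

```
# Hand-rolled equivalent of the two manifests above (a sketch, not the repo files):
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: vpa-recommender
  namespace: kube-system
  labels:
    app: vpa-recommender
spec:
  selector:
    app: vpa-recommender   # assumed pod label from the upstream VPA deployment
  ports:
  - name: metrics
    port: 8942             # the recommender's default metrics port
    targetPort: 8942
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vpa-recommender
  namespace: kube-system
  labels:
    release: prometheus    # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: vpa-recommender
  endpoints:
  - port: metrics
EOF
```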
We can port-forward to the monitoring services to verify them:

- Prometheus service:

```
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090
```

We can then access the Prometheus UI in a browser: http://localhost:9090/query
- Check that kube-state-metrics is being exported:

```
kubectl port-forward svc/prometheus-kube-state-metrics 8080:8080
```

We can then access the kube-state-metrics in a browser: http://localhost:8080/metrics
- Grafana service:

```
kubectl port-forward svc/prometheus-grafana 3000:80
```

We can then access the Grafana UI in a browser: http://localhost:3000
We can log in with username admin and the password obtained from the following command:

```
kubectl --namespace default get secrets prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
```

Once Prometheus starts scraping, we will be able to find metrics such as these:
```
vpa_target_container_recommendation
vpa_recommendation_cpu_lower_bound
vpa_recommendation_memory_upper_bound
```
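For example, with the Prometheus port-forward from above active, one of these metrics can be queried through the Prometheus HTTP API; the metric name is taken from the list above, and the labels on the result will depend on your workload:

```
# Query a VPA recommendation metric via the Prometheus HTTP API:
curl -s --data-urlencode 'query=vpa_target_container_recommendation' \
  http://localhost:9090/api/v1/query
```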
We use a sample workload in this repo to illustrate VPA and vpa-tracker operation.
Let's first create the workload:
```
kubectl apply -f workload-c.yaml
verticalpodautoscaler.autoscaling.k8s.io/workload-c-vpa created
deployment.apps/workload-c created
```
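For reference, the VPA object created by workload-c.yaml looks roughly like the sketch below; the resource policy and the deployment spec are omitted, and the repo file is authoritative:

```
# A sketch of the VPA portion of workload-c.yaml (for illustration only):
cat <<'EOF' | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: workload-c-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload-c
  updatePolicy:
    updateMode: "Auto"   # matches the MODE column in the kubectl get vpa output
EOF
```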
Let's view the VPA custom resource object. Initially, the CPU and memory recommendations are empty since there is not yet sufficient data.

```
kubectl get vpa
NAME             MODE   CPU   MEM   PROVIDED   AGE
workload-c-vpa   Auto               False      43s
```

```
kubectl get pods
NAME                                                     READY   STATUS              RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running             0          117m
prometheus-grafana-9676cd6bf-nqj7p                       3/3     Running             0          117m
prometheus-kube-prometheus-operator-5c6d5464db-tghxq     1/1     Running             0          117m
prometheus-kube-state-metrics-f8fc86d54-4d5gj            1/1     Running             0          117m
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running             0          117m
prometheus-prometheus-node-exporter-bmwkm                0/1     ContainerCreating   0          3s
prometheus-prometheus-node-exporter-dwgf8                1/1     Running             0          117m
workload-c-6884ffcd9d-2xwxp                              0/1     Pending             0          49s
workload-c-6884ffcd9d-57kh6                              0/1     Pending             0          49s
```

Given below is an example of the VPA object once CPU and memory recommendations are available:
```
% kubectl get vpa
NAME             MODE   CPU    MEM       PROVIDED   AGE
workload-c-vpa   Auto   163m   262144k   True       4d19h
```

We see that a new node has been created to accommodate the workload:

```
kubectl get nodes
NAME                                            STATUS   ROLES    AGE    VERSION
ip-192-168-116-111.us-west-1.compute.internal   Ready    <none>   20s    v1.32.3-eks-473151a
ip-192-168-54-249.us-west-1.compute.internal    Ready    <none>   134m   v1.32.3-eks-473151a
```

The following commands create the VPA metrics exporter:
```
% kubectl create namespace monitoring
namespace/monitoring created
% kubectl apply -f vpa-metrics-exporter/vpa_exporter_default.yaml
deployment.apps/vpa-exporter created
service/vpa-exporter created
servicemonitor.monitoring.coreos.com/vpa-exporter created
serviceaccount/vpa-exporter-sa created
clusterrole.rbac.authorization.k8s.io/vpa-exporter-role unchanged
clusterrolebinding.rbac.authorization.k8s.io/vpa-exporter-rolebinding unchanged
```

We add labels to the metrics exporter:
```
kubectl label servicemonitor vpa-exporter release=kube-prometheus-stack --overwrite
```

```
% kubectl port-forward svc/vpa-exporter 8080
Forwarding from 127.0.0.1:8080 -> 8080
```

Check that the vpa-exporter is exporting the new VPA metrics:

```
# HELP vpa_cpu_target_millicores VPA recommended target CPU for container (in millicores)
# TYPE vpa_cpu_target_millicores gauge
vpa_cpu_target_millicores{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 2000.0
# HELP vpa_cpu_uncapped_target_millicores VPA uncapped target CPU for container (in millicores)
# TYPE vpa_cpu_uncapped_target_millicores gauge
vpa_cpu_uncapped_target_millicores{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 4742.0
# HELP vpa_memory_target_bytes VPA recommended target memory for container (in bytes)
# TYPE vpa_memory_target_bytes gauge
vpa_memory_target_bytes{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 262144.0
# HELP vpa_memory_uncapped_target_bytes VPA uncapped target memory for container (in bytes)
# TYPE vpa_memory_uncapped_target_bytes gauge
vpa_memory_uncapped_target_bytes{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 262144.0
```
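While the port-forward to the exporter is active, a quick way to pull out just these metric samples is:

```
# Filter the vpa_* metric samples from the exporter output:
curl -s http://localhost:8080/metrics | grep '^vpa_'
```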
The Prometheus custom resource is updated so that it has the right labels:

```
kubectl label namespace default monitoring-key=enabled
```

Before editing the Prometheus custom resource:

```
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector:
  matchLabels:
    release: prometheus
```

After editing:

```
serviceMonitorNamespaceSelector:
  matchLabels:
    monitoring-key: enabled
serviceMonitorSelector:
  matchLabels:
    release: kube-prometheus-stack
```
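Instead of editing the object interactively, the same change can be applied with a merge patch; the object name below is inferred from the pod name above and may differ in your installation:

```
# Merge-patch the ServiceMonitor selectors on the Prometheus custom resource:
kubectl patch prometheus prometheus-kube-prometheus-prometheus --type merge -p \
  '{"spec":{"serviceMonitorNamespaceSelector":{"matchLabels":{"monitoring-key":"enabled"}},"serviceMonitorSelector":{"matchLabels":{"release":"kube-prometheus-stack"}}}}'
```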
For these new settings to take effect, we delete the Prometheus pod as shown below:

```
kubectl delete pod prometheus-prometheus-kube-prometheus-prometheus-0
```

```
% kubectl get svc
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   4h13m
kubernetes                                ClusterIP   10.100.0.1       <none>        443/TCP                      4h36m
prometheus-grafana                        ClusterIP   10.100.109.134   <none>        80/TCP                       4h13m
prometheus-kube-prometheus-alertmanager   ClusterIP   10.100.56.241    <none>        9093/TCP,8080/TCP            4h13m
prometheus-kube-prometheus-operator       ClusterIP   10.100.37.114    <none>        443/TCP                      4h13m
prometheus-kube-prometheus-prometheus     ClusterIP   10.100.169.54    <none>        9090/TCP,8080/TCP            4h13m
prometheus-kube-state-metrics             ClusterIP   10.100.185.222   <none>        8080/TCP                     4h13m
prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     4h13m
prometheus-prometheus-node-exporter       ClusterIP   10.100.134.255   <none>        9100/TCP                     4h13m
vpa-exporter                              ClusterIP   10.100.140.15    <none>        8080/TCP                     43m
```

```
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
```

We can view a list of metrics in text form: http://localhost:9090/metrics
Prometheus Query UI: http://localhost:9090/query
Prometheus Targets UI: http://localhost:9090/targets
We now look for the vpa-exporter under "ServiceMonitor / monitoring / vpa-exporter" and ensure that its status is shown as UP.
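The same check can be scripted against the Prometheus targets API; here the job label is assumed to follow the Service name, and jq is assumed to be installed:

```
# Report the health of the vpa-exporter scrape target:
curl -s http://localhost:9090/api/v1/targets | \
  jq '.data.activeTargets[] | select(.labels.job == "vpa-exporter") | .health'
```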
Using the exported VPA metrics, we can set up custom panels in Grafana, allowing us to monitor both current CPU usage and VPA recommendations over time.
Shown below is an example of a custom panel illustrating a scale-up of CPU usage:
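A starting point for such a panel is to plot the observed CPU usage of the workload pods next to the recommended target from the vpa-exporter. The label values below are assumptions for this sample workload; the queries are shown against the Prometheus HTTP API so they can be sanity-checked before being pasted into Grafana:

```
# Observed CPU usage of the workload pods, in millicores:
curl -s --data-urlencode \
  'query=sum(rate(container_cpu_usage_seconds_total{namespace="default",pod=~"workload-c-.*"}[5m])) * 1000' \
  http://localhost:9090/api/v1/query

# VPA recommended CPU target exported by the vpa-exporter:
curl -s --data-urlencode \
  'query=vpa_cpu_target_millicores{namespace="default",container="workload-c"}' \
  http://localhost:9090/api/v1/query
```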