Context and background for this VPA Tracker are available in this Elotl blog post: VPA Tracker
In this README we provide detailed steps on how the VPA Tracker stack is set up and used to track a sample deployment workload.
We will use an EKS cluster for this example:
eksctl create cluster -f sel-vpa-ekscluster-luna.yaml
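The cluster config file itself is not reproduced here; as a rough sketch, it is an eksctl ClusterConfig along these lines, where the cluster name and region match the commands in this README and the node group sizing and instance type are illustrative placeholders:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: sel-vpa          # matches the context name and the Luna deploy --name below
  region: us-west-1      # matches the Luna deploy --region below
managedNodeGroups:
  - name: ng-initial     # illustrative placeholder; Luna provisions additional capacity on demand
    instanceType: m5.large   # illustrative placeholder
    desiredCapacity: 1
    minSize: 1
    maxSize: 2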
Rename the cluster context for ease of use (replace the placeholder with the context name eksctl created):
kubectl config rename-context <eksctl-created-context> sel-vpa
Install the Luna autoscaler:
% cd luna/luna-v1.2.18/eks
% ./deploy.sh --name sel-vpa --region us-west-1 --additional-helm-values "--set loggingVerbosity=5 --set telemetry=false --debug"
We use the open-source kube-prometheus-stack project to install Prometheus and Grafana:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack
Here is a sample of the success message after installation:
% helm install prometheus prometheus-community/kube-prometheus-stack
NAME: prometheus
LAST DEPLOYED: Tue Jun 10 14:28:00 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace default get pods -l "release=prometheus"
Get Grafana 'admin' user password by running:
kubectl --namespace default get secrets prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
Access Grafana local instance:
export POD_NAME=$(kubectl --namespace default get pod -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=prometheus" -oname)
kubectl --namespace default port-forward $POD_NAME 3000
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
We install the VPA using the instructions from the VPA Installation guide:
git clone https://github.com/kubernetes/autoscaler.git
% cd autoscaler/vertical-pod-autoscaler
% kubectl config use-context sel-vpa
Switched to context "sel-vpa".
% ./hack/vpa-up.sh
Sample output from a successful install:
...
Generating certs for the VPA Admission Controller in /tmp/vpa-certs.
Certificate request self-signature ok
subject=CN = vpa-webhook.kube-system.svc
Uploading certs to the cluster.
secret/vpa-tls-certs created
Deleting /tmp/vpa-certs.
service/vpa-webhook created
deployment.apps/vpa-admission-controller created
These are the pods of the VPA:
% kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
...SNIP...
kube-system vpa-admission-controller-5d79d9f956-qsqhq 1/1 Running 0 2m48s
kube-system vpa-recommender-544df95d65-n4qjv 1/1 Running 0 2m50s
kube-system vpa-updater-54ddf66b6d-smlnq 1/1 Running 0 2m51s
By default, the VPA recommender exposes metrics on port 8942.
We need to ensure that there is a Kubernetes Service for the recommender pod that maps to this port.
kubectl apply -f vpa-tracker/vpa-metrics-expose-svc.yaml
service/vpa-recommender created
kubectl apply -f vpa-tracker/vpa-recommender-servicemonitor.yaml
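These two manifests live in this repo; as a rough sketch (assuming the recommender pods in kube-system carry the label app: vpa-recommender), they look along these lines:
apiVersion: v1
kind: Service
metadata:
  name: vpa-recommender
  namespace: kube-system
  labels:
    app: vpa-recommender
spec:
  selector:
    app: vpa-recommender      # assumed pod label on the recommender deployment
  ports:
    - name: metrics
      port: 8942
      targetPort: 8942        # default VPA recommender metrics port
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vpa-recommender
  namespace: kube-system
  labels:
    release: prometheus       # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: vpa-recommender
  endpoints:
    - port: metrics
      interval: 30s
To verify the monitoring stack, we can port-forward to the individual services: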
- Prometheus service:
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090
We can then access the Prometheus UI in a browser: http://localhost:9090/query
- Check that kube-state metrics are being exported:
kubectl port-forward svc/prometheus-kube-state-metrics 8080:8080
We can then view the kube-state-metrics output in a browser: http://localhost:8080/metrics
- Grafana service:
kubectl port-forward svc/prometheus-grafana 3000:80
We can then access the Grafana UI in a browser: http://localhost:3000
We can log in with username admin and the password obtained from the following command:
kubectl --namespace default get secrets prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
Once Prometheus starts scraping, we will be able to find metrics such as these:
vpa_target_container_recommendation
vpa_recommendation_cpu_lower_bound
vpa_recommendation_memory_upper_bound
We use a sample workload in this repo to illustrate VPA and vpa-tracker operation.
Let's first create the workload:
kubectl apply -f workload-c.yaml
verticalpodautoscaler.autoscaling.k8s.io/workload-c-vpa created
deployment.apps/workload-c created
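The manifest bundles a Deployment and its VPA object; as a rough sketch (the container image and resource requests here are illustrative placeholders), it looks along these lines:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: workload-c
spec:
  replicas: 2                     # matches the two workload-c pods listed below
  selector:
    matchLabels:
      app: workload-c
  template:
    metadata:
      labels:
        app: workload-c
    spec:
      containers:
        - name: workload-c
          image: <workload-image>   # illustrative placeholder
          resources:
            requests:
              cpu: 100m             # illustrative placeholder
              memory: 128Mi         # illustrative placeholder
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: workload-c-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload-c
  updatePolicy:
    updateMode: "Auto"              # matches the MODE column in the kubectl get vpa output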
Let's view the VPA custom resource object. Initially, the CPU and memory recommendations are empty since there is not yet sufficient data.
kubectl get vpa
NAME MODE CPU MEM PROVIDED AGE
workload-c-vpa Auto False 43s
kubectl get pods
NAME READY STATUS RESTARTS AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 0 117m
prometheus-grafana-9676cd6bf-nqj7p 3/3 Running 0 117m
prometheus-kube-prometheus-operator-5c6d5464db-tghxq 1/1 Running 0 117m
prometheus-kube-state-metrics-f8fc86d54-4d5gj 1/1 Running 0 117m
prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 0 117m
prometheus-prometheus-node-exporter-bmwkm 0/1 ContainerCreating 0 3s
prometheus-prometheus-node-exporter-dwgf8 1/1 Running 0 117m
workload-c-6884ffcd9d-2xwxp 0/1 Pending 0 49s
workload-c-6884ffcd9d-57kh6 0/1 Pending 0 49s
Given below is an example of the VPA object along with CPU and memory recommendations:
% kubectl get vpa
NAME MODE CPU MEM PROVIDED AGE
workload-c-vpa Auto 163m 262144k True 4d19h
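The full recommendation, including the lower bound, target, uncapped target, and upper bound, can be inspected in more detail with, for example:
kubectl describe vpa workload-c-vpa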
kubectl get nodes
We see that a new node has been created to accommodate the workload:
NAME STATUS ROLES AGE VERSION
ip-192-168-116-111.us-west-1.compute.internal Ready <none> 20s v1.32.3-eks-473151a
ip-192-168-54-249.us-west-1.compute.internal Ready <none> 134m v1.32.3-eks-473151a
The following commands create the VPA metrics exporter:
% kubectl create namespace monitoring
namespace/monitoring created
% kubectl apply -f vpa-metrics-exporter/vpa_exporter_default.yaml
deployment.apps/vpa-exporter created
service/vpa-exporter created
servicemonitor.monitoring.coreos.com/vpa-exporter created
serviceaccount/vpa-exporter-sa created
clusterrole.rbac.authorization.k8s.io/vpa-exporter-role unchanged
clusterrolebinding.rbac.authorization.k8s.io/vpa-exporter-rolebinding unchanged
We label the exporter's ServiceMonitor so that it matches the Prometheus serviceMonitorSelector (configured below):
kubectl label servicemonitor vpa-exporter release=kube-prometheus-stack --overwrite
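The label can be verified with, for example:
kubectl get servicemonitor vpa-exporter --show-labels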
% kubectl port-forward svc/vpa-exporter 8080
Forwarding from 127.0.0.1:8080 -> 8080
Check that the vpa-exporter is exporting the new VPA metrics:
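With the port-forward above still running, the metrics endpoint can be fetched with, for example:
curl -s http://localhost:8080/metrics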
# HELP vpa_cpu_target_millicores VPA recommended target CPU for container (in millicores)
# TYPE vpa_cpu_target_millicores gauge
vpa_cpu_target_millicores{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 2000.0
# HELP vpa_cpu_uncapped_target_millicores VPA uncapped target CPU for container (in millicores)
# TYPE vpa_cpu_uncapped_target_millicores gauge
vpa_cpu_uncapped_target_millicores{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 4742.0
# HELP vpa_memory_target_bytes VPA recommended target memory for container (in bytes)
# TYPE vpa_memory_target_bytes gauge
vpa_memory_target_bytes{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 262144.0
# HELP vpa_memory_uncapped_target_bytes VPA uncapped target memory for container (in bytes)
# TYPE vpa_memory_uncapped_target_bytes gauge
vpa_memory_uncapped_target_bytes{container="workload-c",namespace="default",vpa_name="workload-c-vpa"} 262144.0
Next, we label the default namespace and update the Prometheus custom resource so that its selectors match the new ServiceMonitor:
kubectl label namespace default monitoring-key=enabled
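With the Helm release name used above, the Prometheus object itself should be named prometheus-kube-prometheus-prometheus, so it can be edited with something like:
kubectl edit prometheus prometheus-kube-prometheus-prometheus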
Before editing the Prometheus custom resource:
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
After editing:
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring-key: enabled
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack
For these new settings to take effect, we delete the Prometheus pod as shown below:
kubectl delete pod prometheus-prometheus-kube-prometheus-prometheus-0
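The StatefulSet managed by the Prometheus Operator recreates the pod automatically; its progress can be watched with, for example:
kubectl get pod -w prometheus-prometheus-kube-prometheus-prometheus-0
Once it is back to Running, the services, including the new vpa-exporter, look like this: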
% kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 4h13m
kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 4h36m
prometheus-grafana ClusterIP 10.100.109.134 <none> 80/TCP 4h13m
prometheus-kube-prometheus-alertmanager ClusterIP 10.100.56.241 <none> 9093/TCP,8080/TCP 4h13m
prometheus-kube-prometheus-operator ClusterIP 10.100.37.114 <none> 443/TCP 4h13m
prometheus-kube-prometheus-prometheus ClusterIP 10.100.169.54 <none> 9090/TCP,8080/TCP 4h13m
prometheus-kube-state-metrics ClusterIP 10.100.185.222 <none> 8080/TCP 4h13m
prometheus-operated ClusterIP None <none> 9090/TCP 4h13m
prometheus-prometheus-node-exporter ClusterIP 10.100.134.255 <none> 9100/TCP 4h13m
vpa-exporter ClusterIP 10.100.140.15 <none> 8080/TCP 43m
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
We can view a list of metrics in text: http://localhost:9090/metrics
Prometheus Query UI: http://localhost:9090/query
Prometheus Targets UI: http://localhost:9090/targets
We now look for the vpa-exporter under "ServiceMonitor / monitoring / vpa-exporter" and ensure that its status is shown as UP.
Using the exported VPA metrics, we can set up custom panels in Grafana. This allows us to monitor both current CPU usage and VPA recommendations over time.
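For example, a panel comparing actual usage with the recommendation can be built from queries along these lines (the cAdvisor usage metric is standard in this stack; the label values refer to the sample workload above and may need adjusting):
# current CPU usage of the workload containers, in cores
rate(container_cpu_usage_seconds_total{namespace="default", pod=~"workload-c-.*", container="workload-c"}[5m])
# VPA CPU target from the vpa-exporter, converted from millicores to cores
vpa_cpu_target_millicores{namespace="default", vpa_name="workload-c-vpa"} / 1000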
Shown below is an example of a custom panel illustrating a scale-up of CPU usage: