
chore(deps): update helm release kube-prometheus-stack to v54.2.0 #615

Merged
merged 1 commit into main from renovate/kube-prometheus-stack-54.x
Nov 22, 2023

Conversation

@renovate renovate bot commented Nov 22, 2023


This PR contains the following updates:

Package: kube-prometheus-stack (source)
Update:  minor
Change:  54.1.0 -> 54.2.0

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

prometheus-community/helm-charts (kube-prometheus-stack)

v54.2.0

Compare Source

kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards, and Prometheus rules, combined with documentation and scripts, to provide easy-to-operate, end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.
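
As a rough illustration of that scope (an assumption-laden sketch, not part of this update), the chart's top-level values toggle the bundled components:

    grafana:
      enabled: true        # bundled Grafana with the packaged dashboards
    alertmanager:
      enabled: true        # Alertmanager managed by the Prometheus Operator
    nodeExporter:
      enabled: true        # node-exporter DaemonSet
    kubeStateMetrics:
      enabled: true        # kube-state-metrics, scraped as job="kube-state-metrics" in the rules below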

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-25.8.0...kube-prometheus-stack-54.2.0


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever the PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Mend Renovate. View repository job log here.


--- . Kustomization: flux-system/flux-system HelmRelease: monitoring/kube-prometheus-stack

+++ . Kustomization: flux-system/flux-system HelmRelease: monitoring/kube-prometheus-stack

@@ -6,13 +6,13 @@

   namespace: monitoring
 spec:
   interval: 1h
   chart:
     spec:
       chart: kube-prometheus-stack
-      version: 54.1.0
+      version: 54.2.0
       sourceRef:
         kind: HelmRepository
         name: prometheus-community
         namespace: monitoring
       interval: 1h
   upgrade:
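
For context, the sourceRef above resolves to a HelmRepository along these lines; the chart URL is the public prometheus-community repository, while the apiVersion and interval shown here are assumptions about this cluster's setup:

    apiVersion: source.toolkit.fluxcd.io/v1beta2   # assumed Flux API version
    kind: HelmRepository
    metadata:
      name: prometheus-community
      namespace: monitoring
    spec:
      interval: 1h                                 # assumed; mirrors the HelmRelease interval
      url: https://prometheus-community.github.io/helm-charts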


--- . HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-grafana

+++ . HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-grafana

@@ -21,14 +21,14 @@

     metadata:
       labels:
         app.kubernetes.io/name: grafana
         app.kubernetes.io/instance: kube-prometheus-stack
       annotations:
         checksum/dashboards-json-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
-        checksum/sc-dashboard-provider-config: be5a77ee44971c65c9149638434153f216bc7c3069809cfc2249f654a17211ac
-        checksum/secret: fac9ea6701378fadc602867f242d80b0c558afa9498bb96e618efcbf4782b4eb
+        checksum/sc-dashboard-provider-config: 67d376fe7ba32f273f7d6cccbedffd96549db60d35fd48acf567641261e6d16b
+        checksum/secret: c7e83faabde7ebdd4c87f009f84bc77fd2e200c9c9c24ca4f8ddd2151d67a069
         kubectl.kubernetes.io/default-container: grafana
     spec:
       serviceAccountName: kube-prometheus-stack-grafana
       automountServiceAccountToken: true
       securityContext:
         fsGroup: 472
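
The two checksum annotations change because the rendered sidecar dashboard-provider ConfigMap and the Grafana Secret differ between the two chart renders, which is what forces this Deployment to roll. The usual Helm convention behind such annotations is a sha256sum over the rendered template, roughly as follows (template path illustrative):

    annotations:
      checksum/secret: '{{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }}'
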
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.pod-owner

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.pod-owner

@@ -0,0 +1,65 @@

+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: kube-prometheus-stack-k8s.rules.pod-owner
+  namespace: monitoring
+  labels:
+    app: kube-prometheus-stack
+    app.kubernetes.io/managed-by: Helm
+    app.kubernetes.io/instance: kube-prometheus-stack
+    app.kubernetes.io/part-of: kube-prometheus-stack
+    release: kube-prometheus-stack
+    heritage: Helm
+spec:
+  groups:
+  - name: k8s.rules.pod_owner
+    rules:
+    - expr: |-
+        max by (cluster, namespace, workload, pod) (
+          label_replace(
+            label_replace(
+              kube_pod_owner{job="kube-state-metrics", owner_kind="ReplicaSet"},
+              "replicaset", "$1", "owner_name", "(.*)"
+            ) * on (replicaset, namespace) group_left(owner_name) topk by (replicaset, namespace) (
+              1, max by (replicaset, namespace, owner_name) (
+                kube_replicaset_owner{job="kube-state-metrics"}
+              )
+            ),
+            "workload", "$1", "owner_name", "(.*)"
+          )
+        )
+      labels:
+        workload_type: deployment
+      record: namespace_workload_pod:kube_pod_owner:relabel
+    - expr: |-
+        max by (cluster, namespace, workload, pod) (
+          label_replace(
+            kube_pod_owner{job="kube-state-metrics", owner_kind="DaemonSet"},
+            "workload", "$1", "owner_name", "(.*)"
+          )
+        )
+      labels:
+        workload_type: daemonset
+      record: namespace_workload_pod:kube_pod_owner:relabel
+    - expr: |-
+        max by (cluster, namespace, workload, pod) (
+          label_replace(
+            kube_pod_owner{job="kube-state-metrics", owner_kind="StatefulSet"},
+            "workload", "$1", "owner_name", "(.*)"
+          )
+        )
+      labels:
+        workload_type: statefulset
+      record: namespace_workload_pod:kube_pod_owner:relabel
+    - expr: |-
+        max by (cluster, namespace, workload, pod) (
+          label_replace(
+            kube_pod_owner{job="kube-state-metrics", owner_kind="Job"},
+            "workload", "$1", "owner_name", "(.*)"
+          )
+        )
+      labels:
+        workload_type: job
+      record: namespace_workload_pod:kube_pod_owner:relabel
+
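
This new object carries the workload-ownership recording rules that previously lived in the monolithic k8s.rules group (removed further down), now split into their own group. The recorded series can be built on directly; a hypothetical follow-on rule (not shipped by the chart, names illustrative) could count pods per workload:

    groups:
      - name: example.workload-pod-count
        rules:
          - record: namespace_workload:pod_count:count
            expr: |-
              count by (cluster, namespace, workload, workload_type) (
                namespace_workload_pod:kube_pod_owner:relabel
              )
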
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-general.rules

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-general.rules

@@ -48,11 +48,12 @@

           other alerts.
           This alert fires whenever there's a severity="info" alert, and stops firing when another alert with a
           severity of 'warning' or 'critical' starts firing on the same namespace.
           This alert should be routed to a null receiver and configured to inhibit alerts with severity="info".
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/general/infoinhibitor
         summary: Info-level alert inhibition.
-      expr: ALERTS{severity = "info"} == 1 unless on(namespace) ALERTS{alertname !=
-        "InfoInhibitor", severity =~ "warning|critical", alertstate="firing"} == 1
+      expr: ALERTS{severity = "info"} == 1 unless on (namespace) ALERTS{alertname
+        != "InfoInhibitor", severity =~ "warning|critical", alertstate="firing"} ==
+        1
       labels:
         severity: none
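
The only change here is PromQL whitespace (on(namespace) becomes on (namespace)); behaviour is identical. The annotation spells out how InfoInhibitor is meant to be wired up: route it to a null receiver and let it inhibit severity="info" alerts. A minimal Alertmanager sketch of that wiring, with receiver names assumed:

    route:
      receiver: default                     # assumed default receiver
      routes:
        - matchers: ['alertname = "InfoInhibitor"']
          receiver: 'null'
    receivers:
      - name: default
      - name: 'null'
    inhibit_rules:
      - source_matchers: ['alertname = "InfoInhibitor"']
        target_matchers: ['severity = "info"']
        equal: ['namespace']
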
 
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-storage

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-storage

@@ -27,15 +27,15 @@

           kubelet_volume_stats_available_bytes{job="kubelet", namespace=~".*", metrics_path="/metrics"}
             /
           kubelet_volume_stats_capacity_bytes{job="kubelet", namespace=~".*", metrics_path="/metrics"}
         ) < 0.03
         and
         kubelet_volume_stats_used_bytes{job="kubelet", namespace=~".*", metrics_path="/metrics"} > 0
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1
       for: 1m
       labels:
         severity: critical
     - alert: KubePersistentVolumeFillingUp
       annotations:
@@ -52,15 +52,15 @@

           kubelet_volume_stats_capacity_bytes{job="kubelet", namespace=~".*", metrics_path="/metrics"}
         ) < 0.15
         and
         kubelet_volume_stats_used_bytes{job="kubelet", namespace=~".*", metrics_path="/metrics"} > 0
         and
         predict_linear(kubelet_volume_stats_available_bytes{job="kubelet", namespace=~".*", metrics_path="/metrics"}[6h], 4 * 24 * 3600) < 0
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1
       for: 1h
       labels:
         severity: warning
     - alert: KubePersistentVolumeInodesFillingUp
       annotations:
@@ -74,15 +74,15 @@

           kubelet_volume_stats_inodes_free{job="kubelet", namespace=~".*", metrics_path="/metrics"}
             /
           kubelet_volume_stats_inodes{job="kubelet", namespace=~".*", metrics_path="/metrics"}
         ) < 0.03
         and
         kubelet_volume_stats_inodes_used{job="kubelet", namespace=~".*", metrics_path="/metrics"} > 0
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1
       for: 1m
       labels:
         severity: critical
     - alert: KubePersistentVolumeInodesFillingUp
       annotations:
@@ -99,15 +99,15 @@

           kubelet_volume_stats_inodes{job="kubelet", namespace=~".*", metrics_path="/metrics"}
         ) < 0.15
         and
         kubelet_volume_stats_inodes_used{job="kubelet", namespace=~".*", metrics_path="/metrics"} > 0
         and
         predict_linear(kubelet_volume_stats_inodes_free{job="kubelet", namespace=~".*", metrics_path="/metrics"}[6h], 4 * 24 * 3600) < 0
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1
-        unless on(namespace, persistentvolumeclaim)
+        unless on (namespace, persistentvolumeclaim)
         kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1
       for: 1h
       labels:
         severity: warning
     - alert: KubePersistentVolumeErrors
       annotations:
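
Apart from the on (...) spacing, the storage alerts are unchanged: a PVC is skipped when it is ReadOnlyMany or carries the excluded_from_alerts="true" label. A hedged example of opting a claim out (assumes kube-state-metrics is configured to export this PVC label; the claim name and namespace are illustrative):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: scratch-cache
      namespace: monitoring
      labels:
        excluded_from_alerts: "true"   # surfaces as label_excluded_from_alerts in kube-state-metrics
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
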
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules

@@ -1,164 +0,0 @@

----
-apiVersion: monitoring.coreos.com/v1
-kind: PrometheusRule
-metadata:
-  name: kube-prometheus-stack-k8s.rules
-  namespace: monitoring
-  labels:
-    app: kube-prometheus-stack
-    app.kubernetes.io/managed-by: Helm
-    app.kubernetes.io/instance: kube-prometheus-stack
-    app.kubernetes.io/part-of: kube-prometheus-stack
-    release: kube-prometheus-stack
-    heritage: Helm
-spec:
-  groups:
-  - name: k8s.rules
-    rules:
-    - expr: |-
-        sum by (cluster, namespace, pod, container) (
-          irate(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}[5m])
-        ) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (
-          1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
-        )
-      record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate
-    - expr: |-
-        container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
-        * on (cluster, namespace, pod) group_left(node) topk by(cluster, namespace, pod) (1,
-          max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
-        )
-      record: node_namespace_pod_container:container_memory_working_set_bytes
-    - expr: |-
-        container_memory_rss{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
-        * on (cluster, namespace, pod) group_left(node) topk by(cluster, namespace, pod) (1,
-          max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
-        )
-      record: node_namespace_pod_container:container_memory_rss
-    - expr: |-
-        container_memory_cache{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
-        * on (cluster, namespace, pod) group_left(node) topk by(cluster, namespace, pod) (1,
-          max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
-        )
-      record: node_namespace_pod_container:container_memory_cache
-    - expr: |-
-        container_memory_swap{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
-        * on (cluster, namespace, pod) group_left(node) topk by(cluster, namespace, pod) (1,
-          max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
-        )
-      record: node_namespace_pod_container:container_memory_swap
-    - expr: |-
-        kube_pod_container_resource_requests{resource="memory",job="kube-state-metrics"}  * on (namespace, pod, cluster)
-        group_left() max by (namespace, pod, cluster) (
-          (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
-        )
-      record: cluster:namespace:pod_memory:active:kube_pod_container_resource_requests
-    - expr: |-
-        sum by (namespace, cluster) (
-            sum by (namespace, pod, cluster) (
-                max by (namespace, pod, container, cluster) (
-                  kube_pod_container_resource_requests{resource="memory",job="kube-state-metrics"}
-                ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
-                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
-                )
-            )
-        )
-      record: namespace_memory:kube_pod_container_resource_requests:sum
-    - expr: |-
-        kube_pod_container_resource_requests{resource="cpu",job="kube-state-metrics"}  * on (namespace, pod, cluster)
-        group_left() max by (namespace, pod, cluster) (
-          (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
-        )
-      record: cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests
-    - expr: |-
-        sum by (namespace, cluster) (
-            sum by (namespace, pod, cluster) (
-                max by (namespace, pod, container, cluster) (
-                  kube_pod_container_resource_requests{resource="cpu",job="kube-state-metrics"}
-                ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
-                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
-                )
-            )
-        )
-      record: namespace_cpu:kube_pod_container_resource_requests:sum
-    - expr: |-
-        kube_pod_container_resource_limits{resource="memory",job="kube-state-metrics"}  * on (namespace, pod, cluster)
-        group_left() max by (namespace, pod, cluster) (
-          (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
-        )
-      record: cluster:namespace:pod_memory:active:kube_pod_container_resource_limits
-    - expr: |-
-        sum by (namespace, cluster) (
-            sum by (namespace, pod, cluster) (
-                max by (namespace, pod, container, cluster) (
-                  kube_pod_container_resource_limits{resource="memory",job="kube-state-metrics"}
-                ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
-                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
-                )
-            )
-        )
-      record: namespace_memory:kube_pod_container_resource_limits:sum
-    - expr: |-
-        kube_pod_container_resource_limits{resource="cpu",job="kube-state-metrics"}  * on (namespace, pod, cluster)
-        group_left() max by (namespace, pod, cluster) (
-         (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
-         )
-      record: cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits
-    - expr: |-
-        sum by (namespace, cluster) (
-            sum by (namespace, pod, cluster) (
-                max by (namespace, pod, container, cluster) (
-                  kube_pod_container_resource_limits{resource="cpu",job="kube-state-metrics"}
-                ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
-                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
-                )
-            )
-        )
-      record: namespace_cpu:kube_pod_container_resource_limits:sum
-    - expr: |-
-        max by (cluster, namespace, workload, pod) (
-          label_replace(
-            label_replace(
-              kube_pod_owner{job="kube-state-metrics", owner_kind="ReplicaSet"},
-              "replicaset", "$1", "owner_name", "(.*)"
-            ) * on(replicaset, namespace) group_left(owner_name) topk by(replicaset, namespace) (
-              1, max by (replicaset, namespace, owner_name) (
-                kube_replicaset_owner{job="kube-state-metrics"}
-              )
-            ),
-            "workload", "$1", "owner_name", "(.*)"
-          )
-        )
-      labels:
-        workload_type: deployment
-      record: namespace_workload_pod:kube_pod_owner:relabel
-    - expr: |-
-        max by (cluster, namespace, workload, pod) (
-          label_replace(
-            kube_pod_owner{job="kube-state-metrics", owner_kind="DaemonSet"},
-            "workload", "$1", "owner_name", "(.*)"
-          )
-        )
-      labels:
-        workload_type: daemonset
-      record: namespace_workload_pod:kube_pod_owner:relabel
-    - expr: |-
-        max by (cluster, namespace, workload, pod) (
-          label_replace(
-            kube_pod_owner{job="kube-state-metrics", owner_kind="StatefulSet"},
-            "workload", "$1", "owner_name", "(.*)"
-          )
-        )
-      labels:
-        workload_type: statefulset
-      record: namespace_workload_pod:kube_pod_owner:relabel
-    - expr: |-
-        max by (cluster, namespace, workload, pod) (
-          label_replace(
-            kube_pod_owner{job="kube-state-metrics", owner_kind="Job"},
-            "workload", "$1", "owner_name", "(.*)"
-          )
-        )
-      labels:
-        workload_type: job
-      record: namespace_workload_pod:kube_pod_owner:relabel
-
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-memory-rss

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-memory-rss

@@ -0,0 +1,24 @@

+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: kube-prometheus-stack-k8s.rules.container-memory-rss
+  namespace: monitoring
+  labels:
+    app: kube-prometheus-stack
+    app.kubernetes.io/managed-by: Helm
+    app.kubernetes.io/instance: kube-prometheus-stack
+    app.kubernetes.io/part-of: kube-prometheus-stack
+    release: kube-prometheus-stack
+    heritage: Helm
+spec:
+  groups:
+  - name: k8s.rules.container_memory_rss
+    rules:
+    - expr: |-
+        container_memory_rss{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
+        * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (1,
+          max by (cluster, namespace, pod, node) (kube_pod_info{node!=""})
+        )
+      record: node_namespace_pod_container:container_memory_rss
+
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-memory-swap

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-memory-swap

@@ -0,0 +1,24 @@

+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: kube-prometheus-stack-k8s.rules.container-memory-swap
+  namespace: monitoring
+  labels:
+    app: kube-prometheus-stack
+    app.kubernetes.io/managed-by: Helm
+    app.kubernetes.io/instance: kube-prometheus-stack
+    app.kubernetes.io/part-of: kube-prometheus-stack
+    release: kube-prometheus-stack
+    heritage: Helm
+spec:
+  groups:
+  - name: k8s.rules.container_memory_swap
+    rules:
+    - expr: |-
+        container_memory_swap{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
+        * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (1,
+          max by (cluster, namespace, pod, node) (kube_pod_info{node!=""})
+        )
+      record: node_namespace_pod_container:container_memory_swap
+
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-kubelet

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-kubelet

@@ -41,17 +41,17 @@

       annotations:
         description: Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage
           }} of its Pod capacity.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubelettoomanypods
         summary: Kubelet is running at capacity.
       expr: |-
-        count by(cluster, node) (
-          (kube_pod_status_phase{job="kube-state-metrics",phase="Running"} == 1) * on(instance,pod,namespace,cluster) group_left(node) topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
+        count by (cluster, node) (
+          (kube_pod_status_phase{job="kube-state-metrics",phase="Running"} == 1) * on (instance,pod,namespace,cluster) group_left(node) topk by (instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
         )
         /
-        max by(cluster, node) (
+        max by (cluster, node) (
           kube_node_status_capacity{job="kube-state-metrics",resource="pods"} != 1
         ) > 0.95
       for: 15m
       labels:
         severity: info
     - alert: KubeNodeReadinessFlapping
@@ -80,14 +80,14 @@

       annotations:
         description: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds
           on node {{ $labels.node }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletpodstartuplatencyhigh
         summary: Kubelet Pod startup latency is too high.
       expr: histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job="kubelet",
-        metrics_path="/metrics"}[5m])) by (cluster, instance, le)) * on(cluster, instance)
-        group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"}
+        metrics_path="/metrics"}[5m])) by (cluster, instance, le)) * on (cluster,
+        instance) group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"}
         > 60
       for: 15m
       labels:
         severity: warning
     - alert: KubeletClientCertificateExpiration
       annotations:
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubelet.rules

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubelet.rules

@@ -13,24 +13,24 @@

     heritage: Helm
 spec:
   groups:
   - name: kubelet.rules
     rules:
     - expr: histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{job="kubelet",
-        metrics_path="/metrics"}[5m])) by (cluster, instance, le) * on(cluster, instance)
+        metrics_path="/metrics"}[5m])) by (cluster, instance, le) * on (cluster, instance)
         group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"})
       labels:
         quantile: '0.99'
       record: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile
     - expr: histogram_quantile(0.9, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{job="kubelet",
-        metrics_path="/metrics"}[5m])) by (cluster, instance, le) * on(cluster, instance)
+        metrics_path="/metrics"}[5m])) by (cluster, instance, le) * on (cluster, instance)
         group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"})
       labels:
         quantile: '0.9'
       record: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile
     - expr: histogram_quantile(0.5, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{job="kubelet",
-        metrics_path="/metrics"}[5m])) by (cluster, instance, le) * on(cluster, instance)
+        metrics_path="/metrics"}[5m])) by (cluster, instance, le) * on (cluster, instance)
         group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"})
       labels:
         quantile: '0.5'
       record: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile
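
These kubelet.rules edits are again formatting only. The recorded PLEG relist quantiles are convenient to alert on; a hypothetical alert (not shipped by this chart, threshold illustrative) might look like:

    groups:
      - name: example.kubelet-pleg
        rules:
          - alert: KubeletPlegRelistSlowExample
            expr: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile{quantile="0.99"} >= 10
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: Kubelet PLEG relist p99 latency is high (illustrative threshold).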
 
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-cpu-usage-seconds-tot

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-cpu-usage-seconds-tot

@@ -0,0 +1,25 @@

+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: kube-prometheus-stack-k8s.rules.container-cpu-usage-seconds-tot
+  namespace: monitoring
+  labels:
+    app: kube-prometheus-stack
+    app.kubernetes.io/managed-by: Helm
+    app.kubernetes.io/instance: kube-prometheus-stack
+    app.kubernetes.io/part-of: kube-prometheus-stack
+    release: kube-prometheus-stack
+    heritage: Helm
+spec:
+  groups:
+  - name: k8s.rules.container_cpu_usage_seconds_total
+    rules:
+    - expr: |-
+        sum by (cluster, namespace, pod, container) (
+          irate(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}[5m])
+        ) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (
+          1, max by (cluster, namespace, pod, node) (kube_pod_info{node!=""})
+        )
+      record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate
+
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-resource

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-resource

@@ -0,0 +1,86 @@

+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: kube-prometheus-stack-k8s.rules.container-resource
+  namespace: monitoring
+  labels:
+    app: kube-prometheus-stack
+    app.kubernetes.io/managed-by: Helm
+    app.kubernetes.io/instance: kube-prometheus-stack
+    app.kubernetes.io/part-of: kube-prometheus-stack
+    release: kube-prometheus-stack
+    heritage: Helm
+spec:
+  groups:
+  - name: k8s.rules.container_resource
+    rules:
+    - expr: |-
+        kube_pod_container_resource_requests{resource="memory",job="kube-state-metrics"}  * on (namespace, pod, cluster)
+        group_left() max by (namespace, pod, cluster) (
+          (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
+        )
+      record: cluster:namespace:pod_memory:active:kube_pod_container_resource_requests
+    - expr: |-
+        sum by (namespace, cluster) (
+            sum by (namespace, pod, cluster) (
+                max by (namespace, pod, container, cluster) (
+                  kube_pod_container_resource_requests{resource="memory",job="kube-state-metrics"}
+                ) * on (namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
+                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
+                )
+            )
+        )
+      record: namespace_memory:kube_pod_container_resource_requests:sum
+    - expr: |-
+        kube_pod_container_resource_requests{resource="cpu",job="kube-state-metrics"}  * on (namespace, pod, cluster)
+        group_left() max by (namespace, pod, cluster) (
+          (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
+        )
+      record: cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests
+    - expr: |-
+        sum by (namespace, cluster) (
+            sum by (namespace, pod, cluster) (
+                max by (namespace, pod, container, cluster) (
+                  kube_pod_container_resource_requests{resource="cpu",job="kube-state-metrics"}
+                ) * on (namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
+                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
+                )
+            )
+        )
+      record: namespace_cpu:kube_pod_container_resource_requests:sum
+    - expr: |-
+        kube_pod_container_resource_limits{resource="memory",job="kube-state-metrics"}  * on (namespace, pod, cluster)
+        group_left() max by (namespace, pod, cluster) (
+          (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
+        )
+      record: cluster:namespace:pod_memory:active:kube_pod_container_resource_limits
+    - expr: |-
+        sum by (namespace, cluster) (
+            sum by (namespace, pod, cluster) (
+                max by (namespace, pod, container, cluster) (
+                  kube_pod_container_resource_limits{resource="memory",job="kube-state-metrics"}
+                ) * on (namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
+                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
+                )
+            )
+        )
+      record: namespace_memory:kube_pod_container_resource_limits:sum
+    - expr: |-
+        kube_pod_container_resource_limits{resource="cpu",job="kube-state-metrics"}  * on (namespace, pod, cluster)
+        group_left() max by (namespace, pod, cluster) (
+         (kube_pod_status_phase{phase=~"Pending|Running"} == 1)
+         )
+      record: cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits
+    - expr: |-
+        sum by (namespace, cluster) (
+            sum by (namespace, pod, cluster) (
+                max by (namespace, pod, container, cluster) (
+                  kube_pod_container_resource_limits{resource="cpu",job="kube-state-metrics"}
+                ) * on (namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (
+                  kube_pod_status_phase{phase=~"Pending|Running"} == 1
+                )
+            )
+        )
+      record: namespace_cpu:kube_pod_container_resource_limits:sum
+
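
This object takes over the per-namespace request/limit aggregations from the removed k8s.rules group. Because both the requests and limits sums are recorded, downstream rules or dashboards can combine them cheaply; for example, a hypothetical ratio of memory requests to limits per namespace (series name illustrative, not part of the chart):

    groups:
      - name: example.namespace-resource-ratio
        rules:
          - record: namespace_memory:requests_to_limits:ratio
            expr: |-
              namespace_memory:kube_pod_container_resource_requests:sum
                /
              namespace_memory:kube_pod_container_resource_limits:sum
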
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-apps

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-apps

@@ -31,16 +31,16 @@

         description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready
           state for longer than 15 minutes.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubepodnotready
         summary: Pod has been in a non-ready state for more than 15 minutes.
       expr: |-
         sum by (namespace, pod, cluster) (
-          max by(namespace, pod, cluster) (
+          max by (namespace, pod, cluster) (
             kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown|Failed"}
-          ) * on(namespace, pod, cluster) group_left(owner_kind) topk by(namespace, pod, cluster) (
-            1, max by(namespace, pod, owner_kind, cluster) (kube_pod_owner{owner_kind!="Job"})
+          ) * on (namespace, pod, cluster) group_left(owner_kind) topk by (namespace, pod, cluster) (
+            1, max by (namespace, pod, owner_kind, cluster) (kube_pod_owner{owner_kind!="Job"})
           )
         ) > 0
       for: 15m
       labels:
         severity: warning
     - alert: KubeDeploymentGenerationMismatch
@@ -221,13 +221,13 @@

       annotations:
         description: Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking
           more than {{ "43200" | humanizeDuration }} to complete.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubejobnotcompleted
         summary: Job did not complete in time
       expr: |-
-        time() - max by(namespace, job_name, cluster) (kube_job_status_start_time{job="kube-state-metrics", namespace=~".*"}
+        time() - max by (namespace, job_name, cluster) (kube_job_status_start_time{job="kube-state-metrics", namespace=~".*"}
           and
         kube_job_status_active{job="kube-state-metrics", namespace=~".*"} > 0) > 43200
       labels:
         severity: warning
     - alert: KubeJobFailed
       annotations:
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-memory-cache

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-k8s.rules.container-memory-cache

@@ -0,0 +1,24 @@

+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: kube-prometheus-stack-k8s.rules.container-memory-cache
+  namespace: monitoring
+  labels:
+    app: kube-prometheus-stack
+    app.kubernetes.io/managed-by: Helm
+    app.kubernetes.io/instance: kube-prometheus-stack
+    app.kubernetes.io/part-of: kube-prometheus-stack
+    release: kube-prometheus-stack
+    heritage: Helm
+spec:
+  groups:
+  - name: k8s.rules.container_memory_cache
+    rules:
+    - expr: |-
+        container_memory_cache{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
+        * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (1,
+          max by (cluster, namespace, pod, node) (kube_pod_info{node!=""})
+        )
+      record: node_namespace_pod_container:container_memory_cache
+
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-apiserver

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-apiserver

@@ -19,47 +19,47 @@

       annotations:
         description: A client certificate used to authenticate to kubernetes apiserver
           is expiring in less than 7.0 days.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeclientcertificateexpiration
         summary: Client certificate is about to expire.
       expr: apiserver_client_certificate_expiration_seconds_count{job="apiserver"}
-        > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m])))
+        > 0 and on (job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m])))
         < 604800
       for: 5m
       labels:
         severity: warning
     - alert: KubeClientCertificateExpiration
       annotations:
         description: A client certificate used to authenticate to kubernetes apiserver
           is expiring in less than 24.0 hours.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeclientcertificateexpiration
         summary: Client certificate is about to expire.
       expr: apiserver_client_certificate_expiration_seconds_count{job="apiserver"}
-        > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m])))
+        > 0 and on (job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m])))
         < 86400
       for: 5m
       labels:
         severity: critical
     - alert: KubeAggregatedAPIErrors
       annotations:
         description: Kubernetes aggregated API {{ $labels.name }}/{{ $labels.namespace
           }} has reported errors. It has appeared unavailable {{ $value | humanize
           }} times averaged over the past 10m.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeaggregatedapierrors
         summary: Kubernetes aggregated API has reported errors.
-      expr: sum by(name, namespace, cluster)(increase(aggregator_unavailable_apiservice_total{job="apiserver"}[10m]))
+      expr: sum by (name, namespace, cluster)(increase(aggregator_unavailable_apiservice_total{job="apiserver"}[10m]))
         > 4
       labels:
         severity: warning
     - alert: KubeAggregatedAPIDown
       annotations:
         description: Kubernetes aggregated API {{ $labels.name }}/{{ $labels.namespace
           }} has been only {{ $value | humanize }}% available over the last 10m.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeaggregatedapidown
         summary: Kubernetes aggregated API is down.
-      expr: (1 - max by(name, namespace, cluster)(avg_over_time(aggregator_unavailable_apiservice{job="apiserver"}[10m])))
+      expr: (1 - max by (name, namespace, cluster)(avg_over_time(aggregator_unavailable_apiservice{job="apiserver"}[10m])))
         * 100 < 85
       for: 5m
       labels:
         severity: warning
     - alert: KubeAPIDown
       annotations:
--- . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-node.rules

+++ . HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-node.rules

@@ -13,22 +13,22 @@

     heritage: Helm
 spec:
   groups:
   - name: node.rules
     rules:
     - expr: |-
-        topk by(cluster, namespace, pod) (1,
+        topk by (cluster, namespace, pod) (1,
           max by (cluster, node, namespace, pod) (
             label_replace(kube_pod_info{job="kube-state-metrics",node!=""}, "pod", "$1", "pod", "(.*)")
         ))
       record: 'node_namespace_pod:kube_pod_info:'
     - expr: |-
         count by (cluster, node) (
           node_cpu_seconds_total{mode="idle",job="node-exporter"}
           * on (namespace, pod) group_left(node)
-          topk by(namespace, pod) (1, node_namespace_pod:kube_pod_info:)
+          topk by (namespace, pod) (1, node_namespace_pod:kube_pod_info:)
         )
       record: node:node_num_cpu:sum
     - expr: |-
         sum(
           node_memory_MemAvailable_bytes{job="node-exporter"} or
           (

@bo0tzz bo0tzz merged commit 2145acc into main Nov 22, 2023
2 checks passed
@renovate renovate bot deleted the renovate/kube-prometheus-stack-54.x branch November 22, 2023 13:40