I don't know the intentions of #353, or whether it is supposed to apply only to jobs or also to the TaskManager and JobManager. In the current setup there is only one PodDisruptionBudget per cluster, and it covers all pods (job, taskmanager, jobmanager, ...) because the selector labels are not specific enough. Either that, or the logic for calculating the desired number of pods is faulty.
Spec of the existing PodDisruptionBudget:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2023-04-24T13:58:26Z"
  generation: 1
  labels:
    app: flink
    cluster: flink-cluster-new
  name: flink-flink-cluster-new
  namespace: gorr
  ownerReferences:
  - apiVersion: flinkoperator.k8s.io/v1beta1
    blockOwnerDeletion: false
    controller: true
    kind: FlinkCluster
    name: flink-cluster-new
    uid: 91f1c563-ca72-4539-aeaa-586d57942cd5
  resourceVersion: "2016776591"
  uid: aa825413-a554-4fa0-a154-67120afdc135
spec:
  maxUnavailable: 0%
  selector:
    matchLabels:
      app: flink
      cluster: flink-cluster-new
status:
  conditions:
  - lastTransitionTime: "2023-04-24T13:58:26Z"
    message: jobs.batch does not implement the scale subresource
    observedGeneration: 1
    reason: SyncFailed
    status: "False"
    type: DisruptionAllowed
  currentHealthy: 0
  desiredHealthy: 4
  disruptionsAllowed: 0
  expectedPods: 4
  observedGeneration: 1
```
And the running pods:

```
$ kgpol app=flink,cluster=flink-cluster-new
NAME                                    READY   STATUS    RESTARTS   AGE
flink-cluster-new-job-submitter-8nh27   1/1     Running   0          25m
flink-cluster-new-jobmanager-0          1/1     Running   0          25m
flink-cluster-new-taskmanager-0         1/1     Running   0          25m
flink-cluster-new-taskmanager-1         1/1     Running   0          25m
flink-cluster-new-taskmanager-2         1/1     Running   0          25m
```
This means that a Pod can never be safely evicted to another node; it simply dies when its node is removed from the cluster or shut down. I would prefer to have a PDB per Pod type.
I am on version 0.4.0; I will try 0.5.0 to see if there are any changes around this.
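For illustration, a per-component PDB could look something like the sketch below. This is only a rough idea, not what the operator currently generates; the `component: taskmanager` label and the `maxUnavailable` value are assumptions and would need to match whatever labels the operator actually puts on TaskManager pods.

```yaml
# Hypothetical per-component PDB (a sketch, not operator output).
# Assumes TaskManager pods carry a component: taskmanager label.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: flink-cluster-new-taskmanager
  namespace: gorr
spec:
  # Allow one TaskManager at a time to be evicted during node drains.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: flink
      cluster: flink-cluster-new
      component: taskmanager
```

A similar object scoped to the JobManager (and none at all for one-shot job-submitter pods, which cannot satisfy a PDB since `jobs.batch` does not implement the scale subresource) would avoid the single all-pods budget that currently blocks every eviction.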