I don't know the intentions of #353, or whether it is supposed to apply only to jobs or also to the TaskManager and JobManager. In the current setup there is only one PodDisruptionBudget per cluster, and it covers all pods (job, taskmanager, jobmanager, ...) because the selector labels are not specific enough. Either that, or the logic for calculating the desired number of pods is faulty.
Spec of the existing PodDisruptionBudget:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2023-04-24T13:58:26Z"
  generation: 1
  labels:
    app: flink
    cluster: flink-cluster-new
  name: flink-flink-cluster-new
  namespace: gorr
  ownerReferences:
  - apiVersion: flinkoperator.k8s.io/v1beta1
    blockOwnerDeletion: false
    controller: true
    kind: FlinkCluster
    name: flink-cluster-new
    uid: 91f1c563-ca72-4539-aeaa-586d57942cd5
  resourceVersion: "2016776591"
  uid: aa825413-a554-4fa0-a154-67120afdc135
spec:
  maxUnavailable: 0%
  selector:
    matchLabels:
      app: flink
      cluster: flink-cluster-new
status:
  conditions:
  - lastTransitionTime: "2023-04-24T13:58:26Z"
    message: jobs.batch does not implement the scale subresource
    observedGeneration: 1
    reason: SyncFailed
    status: "False"
    type: DisruptionAllowed
  currentHealthy: 0
  desiredHealthy: 4
  disruptionsAllowed: 0
  expectedPods: 4
  observedGeneration: 1
```
And the running pods:

```
$ kgpol app=flink,cluster=flink-cluster-new
NAME                                    READY   STATUS    RESTARTS   AGE
flink-cluster-new-job-submitter-8nh27   1/1     Running   0          25m
flink-cluster-new-jobmanager-0          1/1     Running   0          25m
flink-cluster-new-taskmanager-0         1/1     Running   0          25m
flink-cluster-new-taskmanager-1         1/1     Running   0          25m
flink-cluster-new-taskmanager-2         1/1     Running   0          25m
```
This means that a Pod can never be safely evicted to another node; it simply dies when its node is removed from the cluster or shut down. I would prefer to have a PDB per Pod type.
I am on version 0.4.0; I will try 0.5.0 to see if there are any changes around this.
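For illustration, a per-component PDB could look something like the sketch below. This is only a rough idea, not what the operator currently generates; the `component: taskmanager` label and the `maxUnavailable` value are assumptions and would need to match whatever labels the operator actually puts on TaskManager pods.

```yaml
# Hypothetical per-component PDB (a sketch, not operator output).
# Assumes TaskManager pods carry a component: taskmanager label.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: flink-cluster-new-taskmanager
  namespace: gorr
spec:
  # Allow one TaskManager at a time to be evicted during node drains.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: flink
      cluster: flink-cluster-new
      component: taskmanager
```

A similar object scoped to the JobManager (and none at all for one-shot job-submitter pods, which cannot satisfy a PDB since `jobs.batch` does not implement the scale subresource) would avoid the single all-pods budget that currently blocks every eviction.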