Pod Affinity Feature Causing Flink Pipeline Redeployment to Fail #674

@guruguha

Description

We run Flink pipelines in our production environment using the Spotify Flink operator, release v0.4.2.

We wanted to upgrade to the latest release, v0.5.0, which adds pod affinity support. When we did this in our lower environments, redeploying a Flink pipeline failed with an error about HorizontalPodAutoscaler. Below is the error from the Flink operator logs:

{"level":"error","ts":"2023-04-18T17:51:04Z","logger":"controllers.FlinkCluster","msg":"Failed to observe the current state","controller":"flinkcluster","controllerGroup":"flinkoperator.k8s.io","controllerKind":"FlinkCluster","FlinkCluster":{"name":"dataprep-v1","namespace":"flink-dataprep"},"namespace":"flink-dataprep","name":"dataprep-v1","reconcileID":"bfffad73-7557-4d9f-bc97-320fd42cc598","error":"no matches for kind \"HorizontalPodAutoscaler\" in version \"autoscaling/v2\"","stacktrace":"github.com/spotify/flink-on-k8s-operator/controllers/flinkcluster.(*FlinkClusterHandler).reconcile\n\t/workspace/controllers/flinkcluster/flinkcluster_controller.go:153\ngithub.com/spotify/flink-on-k8s-operator/controllers/flinkcluster.(*FlinkClusterReconciler).Reconcile\n\t/workspace/controllers/flinkcluster/flinkcluster_controller.go:97\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235"}

We started seeing this failure after adding pod affinity to our cluster spec. It did not occur with the previous operator version.

It looks like HorizontalPodAutoscaler in autoscaling/v2 requires the EKS cluster to be on Kubernetes 1.22/1.23 or later. Can someone confirm this behavior? Our EKS cluster is on 1.21, which the release notes say is the minimum required version.
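
For anyone trying to reproduce this: the "no matches for kind" error suggests the API server does not serve the autoscaling/v2 group/version at all. Below is a minimal sketch for checking that from the output of `kubectl api-versions`. The helper names (`served_autoscaling_versions`, `serves_hpa_v2`) are made up for illustration and are not part of the operator.

```python
import subprocess
from typing import List


def served_autoscaling_versions(api_versions_output: str) -> List[str]:
    """Return the autoscaling group/versions found in `kubectl api-versions` output."""
    return sorted(
        line.strip()
        for line in api_versions_output.splitlines()
        if line.strip().startswith("autoscaling/")
    )


def serves_hpa_v2(api_versions_output: str) -> bool:
    """True if the cluster lists autoscaling/v2 (exact match, not v2beta1/v2beta2)."""
    return "autoscaling/v2" in served_autoscaling_versions(api_versions_output)


if __name__ == "__main__":
    # Assumes kubectl is installed and pointed at the cluster in question.
    out = subprocess.run(
        ["kubectl", "api-versions"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    print("autoscaling versions served:", served_autoscaling_versions(out))
    print("autoscaling/v2 served:", serves_hpa_v2(out))
```

If the second line prints `False`, the cluster cannot resolve the HorizontalPodAutoscaler kind in autoscaling/v2, which would match the error in the log above.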
