KubernetesPodOperator dry_run failure #45812

Open
2 tasks done
baryluk opened this issue Jan 20, 2025 · 3 comments
Assignees
Labels
area:providers good first issue kind:bug This is a clearly a bug provider:cncf-kubernetes Kubernetes provider related issues

Comments

@baryluk
Contributor

baryluk commented Jan 20, 2025

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes 4.3.0

Apache Airflow version

2.3.4

Operating System

Linux

Deployment

Official Apache Airflow Helm Chart

Deployment details

n/a

What happened

We are upgrading from apache-airflow-providers-cncf-kubernetes 3.0.0 to 4.3.0 (going slowly through releases).

We have a custom script that, during the Docker image build of our Airflow, tests all DAGs and all DAG tasks in dry_run mode, mostly to detect Python syntax errors, DAG cycles, duplicate tasks, wrong imports, templating errors, etc.
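For context, a check of this kind can be sketched roughly as below. This is not our actual script: `load_dags`, `dry_run_all` and the folder argument are hypothetical names, and the only Airflow APIs assumed are `DagBag` (with its `import_errors` and `dags` attributes) and the operator's `dry_run()` method.

```python
from typing import Iterable, List, Tuple


def load_dags(dag_folder: str):
    """Parse every DAG file in dag_folder and fail fast on import errors."""
    # Imported lazily so dry_run_all() below is usable without Airflow installed.
    from airflow.models import DagBag

    dagbag = DagBag(dag_folder=dag_folder, include_examples=False)
    if dagbag.import_errors:
        # import_errors maps a file path to the error text for that file.
        raise RuntimeError(f"DAG import errors: {dagbag.import_errors}")
    return list(dagbag.dags.values())


def dry_run_all(dags: Iterable) -> List[Tuple[str, str, Exception]]:
    """Call dry_run() on every task and collect failures instead of stopping."""
    failures = []
    for dag in dags:
        for task in dag.tasks:
            try:
                task.dry_run()
            except Exception as exc:  # collect everything, fail the build later
                failures.append((dag.dag_id, task.task_id, exc))
    return failures
```

In an image build this would be followed by something like `failures = dry_run_all(load_dags(folder))` and a non-zero exit if the list is not empty.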

This was working fine with our existing Airflow, but we decided to upgrade Airflow to a newer version, which also means updating the Airflow providers. After fixing a bunch of other issues, I found an issue with the KubernetesPodOperator dry run.

The new dry_run added in d56ff76 invokes the KubernetesPodOperator build_pod_request_obj() method, which reads the property self.hook.is_in_cluster:

        pod.metadata.labels.update(
            {
                'airflow_version': airflow_version.replace('+', '-'),
                'airflow_kpo_in_cluster': str(self.hook.is_in_cluster),
            }
        )

Unfortunately, this property constructs a Kubernetes API client object, which requires a kube client config / credentials to work.

    @property
    def is_in_cluster(self):
        """Expose whether the hook is configured with ``load_incluster_config`` or not"""
        if self._is_in_cluster is not None:
            return self._is_in_cluster
        self.api_client  # so we can determine if we are in_cluster or not
        return self._is_in_cluster

This prevents dry_run from executing in an isolated test environment:

Traceback (most recent call last):
  File "<stdin>", line 97, in <module>
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/dag.py", line 2307, in cli
    args.func(args, self)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 51, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/cli.py", line 99, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/task_command.py", line 545, in task_test
    ti.dry_run()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1815, in dry_run
    self.task.dry_run()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 607, in dry_run
    pod = self.build_pod_request_obj()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 595, in build_pod_request_obj
    'airflow_kpo_in_cluster': str(self.hook.is_in_cluster),
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 283, in is_in_cluster
    self.api_client  # so we can determine if we are in_cluster or not
  File "/usr/local/lib/python3.9/functools.py", line 993, in __get__
    val = self.func(instance)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 291, in api_client
    return self.get_conn()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 239, in get_conn
    config.load_kube_config(
  File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/config/kube_config.py", line 808, in load_kube_config
    loader = _get_kube_config_loader(
  File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/config/kube_config.py", line 767, in _get_kube_config_loader
    raise ConfigException(
kubernetes.config.config_exception.ConfigException: Invalid kube-config file. No configuration found.

We would like to continue using dry_run, but be able to run it without providing credentials or a kube config. It does not need to be 100% accurate.
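In the meantime, one possible workaround in a test harness is to stub the property so that the API client is never built. The sketch below uses stand-in code (`FakeHook`, `build_labels`) that only models the call chain from the traceback; it is not the provider's code. Against a real installation the patch target would presumably be `KubernetesHook` in `airflow.providers.cncf.kubernetes.hooks.kubernetes`, which is an assumption and may differ between provider versions.

```python
from functools import cached_property
from unittest.mock import PropertyMock, patch


class FakeHook:
    """Stand-in for the hook: resolving is_in_cluster needs an API client."""

    _is_in_cluster = None

    @cached_property
    def api_client(self):
        # Models config.load_kube_config() failing when no kube config exists.
        raise RuntimeError("Invalid kube-config file. No configuration found.")

    @property
    def is_in_cluster(self):
        if self._is_in_cluster is not None:
            return self._is_in_cluster
        self.api_client  # triggers client construction, as in the real hook
        return self._is_in_cluster


def build_labels(hook):
    """Stand-in for the label update done in build_pod_request_obj()."""
    return {"airflow_kpo_in_cluster": str(hook.is_in_cluster)}


# Replace the property on the class for the duration of the dry run only.
with patch.object(
    FakeHook, "is_in_cluster", new_callable=PropertyMock, return_value=False
):
    labels = build_labels(FakeHook())
```

The downside is that the harness then depends on a provider internal, which is why a supported switch would be preferable.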

Two options:

  • An env var to bypass setting the airflow_kpo_in_cluster label in dry run mode, if the user requests it.
  • Never populate it in dry_run mode (change the signature of build_pod_request_obj to take a dry_run: bool = False kwarg and invoke it with dry_run=True in the KubernetesPodOperator.dry_run() method).

(or both)
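A rough sketch of what these options could look like when combined is below. It is a simplified stand-in, not the provider's code: `build_labels` models only the label update inside `build_pod_request_obj()`, the `dry_run` keyword is the proposed addition rather than an existing parameter, and the env var name `AIRFLOW_KPO_SKIP_IN_CLUSTER_LABEL` is made up for illustration.

```python
import os

# Hypothetical env var name, used only for this sketch.
SKIP_ENV_VAR = "AIRFLOW_KPO_SKIP_IN_CLUSTER_LABEL"


def build_labels(hook, airflow_version: str, dry_run: bool = False) -> dict:
    """Build pod labels, skipping the in-cluster lookup when it is not wanted."""
    labels = {"airflow_version": airflow_version.replace("+", "-")}
    skip = dry_run or os.environ.get(SKIP_ENV_VAR, "").lower() in ("1", "true")
    if not skip:
        # Only this branch touches the hook, and therefore the Kube API client.
        labels["airflow_kpo_in_cluster"] = str(hook.is_in_cluster)
    return labels


class NoConfigHook:
    """Stand-in hook that behaves like an environment without a kube config."""

    @property
    def is_in_cluster(self):
        raise RuntimeError("No configuration found.")


# With dry_run=True the hook is never touched, so no credentials are needed.
print(build_labels(NoConfigHook(), "2.3.4+local", dry_run=True))
```

Here the keyword always skips the label in dry run, while the env var is an explicit opt-out; whether both are needed is exactly the design question above.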

What you think should happen instead

n/a

How to reproduce

n/a

Anything else

n/a

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@baryluk baryluk added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jan 20, 2025
@dosubot dosubot bot added the provider:cncf-kubernetes Kubernetes provider related issues label Jan 20, 2025
@potiuk potiuk added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Jan 20, 2025
@potiuk
Member

potiuk commented Jan 20, 2025

Feel free to attempt to fix it and provide a PR.

@baryluk
Contributor Author

baryluk commented Jan 21, 2025

@potiuk Sure, I can, but I wanted to get some feedback from the kubernetes provider maintainers first on what they would prefer.

@potiuk
Member

potiuk commented Jan 21, 2025

There are no "kubernetes provider maintainers" here. It's all "airflow" maintainers: anyone can comment here, and any maintainer can approve a PR that is created. Since we are in the hottest part of building Airflow 3, it's not very likely that you will get more feedback than that, so creating a good PR with a proposal for how you would like to solve it is the best way to grab the attention of maintainers who could approve it.
