Replies: 1 comment 1 reply
Yes, those are the plans, and we are working on them mostly as sub-AIPs of AIP-1 - Improve Airflow Security. We've already implemented many of the steps needed to make it happen eventually. The most recent is AIP-43 - DAG Processor separation, which allows separate folders where each team can have its own DAG processor. We are currently working on the dependent AIP-44 - Airflow Internal API, which is needed to improve the isolation between user-provided code, tasks and the database. That will mostly address your point 1.
We've already started discussing which AIPs come next: AIP-46 - Runtime isolation for Airflow Tasks and DAG parsing, and then a future, not yet written AIP about fine-grained isolation of permissions between tasks (I have a discussion today that might end up with an initial proposal for that AIP). This is the answer to your point 2.
Once all of those are completed (I guess around 2.6 or 2.7), I think full isolation will be possible. Until then, any isolation attempt gives a "false impression" of isolation, because there are various ways users could escape it. Only after all those points are implemented will true multi-tenancy be possible.
You can watch our talk from this year's Airflow Summit, given together with @mhenc (https://airflowsummit.org/sessions/2022/multitenancy-is-coming/), where we describe the work done, the work ahead of us and the general roadmap. You can also follow the #sig-multitenancy channel in our Slack, where we announce progress and propose meetings. You can also see minutes and recordings from past meetings in AIP-1 (linked above).
This is possible even today with Git-Sync. You can even use submodules to get DAGs from multiple repositories. You can learn more about it from a fantastic talk by Anum from Jagex, https://airflowsummit.org/sessions/2022/manage-dags-at-scale/. They manage a big Airflow installation with 190 or so repositories joined via submodules, and they show the approach they took to manage DAGs from multiple teams. Watch it and you will see that this is possible even today (but it does not yet provide full isolation until we implement all the AIP-1-related work).
Description
I need a "workspace" or "DAG Group" to support multi-tenancy. There are different projects within the organization, and I hope they can be isolated from each other.
I expect Airflow to evolve into a platform rather than a product, so I hope it can load DAG files from SCM, as Jenkins does.
Use case/motivation
I would like to add a "workspace" or "DAG Group", together with Airflow permission management, to isolate different projects.
An organization has many lines of business and many employees involved, and the same is true of its data projects. I therefore hope that DAGs belonging to different lines of business or different projects can be isolated from each other as a "workspace" or "DAG Group".
I've been using Airflow for a while now and I'm also trying to explore best practices. With my current experience, I hope Airflow can become a platform.
Currently, we keep DAG source control in GitLab and have built a Jenkins pipeline for CI/CD, and we need to use NFS on the Scheduler node and the Worker nodes to distribute the DAG code.
When we deploy a DAG, we need a pipeline to perform a "git pull" under the DAG folder, so I hope Airflow can provide functionality that simplifies the current DAG onboarding process and the distribution of DAG code from the Scheduler to the Workers.
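For illustration, the deploy step described above could look roughly like the sketch below. It is a minimal example under assumptions, not our actual pipeline: the function names `build_sync_commands` and `sync_dag_folder` and the path `/opt/airflow/dags` are hypothetical, and the submodule step only applies if the DAG repository uses git submodules.

```python
import subprocess
from pathlib import Path
from typing import List


def build_sync_commands(dag_folder: Path) -> List[List[str]]:
    """Return the git commands that refresh a DAG folder (hypothetical helper)."""
    repo = str(dag_folder)
    return [
        # Fast-forward only, so a diverged checkout fails loudly instead of merging.
        ["git", "-C", repo, "pull", "--ff-only"],
        # Assumption: the repository may contain submodules (one per team or project).
        ["git", "-C", repo, "submodule", "update", "--init", "--recursive"],
    ]


def sync_dag_folder(dag_folder: Path, dry_run: bool = False) -> List[List[str]]:
    """Run the sync commands in order; with dry_run=True only return them."""
    commands = build_sync_commands(dag_folder)
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if git exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

A Jenkins stage could then call something like `sync_dag_folder(Path("/opt/airflow/dags"))` on the node that hosts the shared NFS DAG folder.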
Related issues
No