-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(utils/decorators): rewrite remove task decorator to use ast #43383
base: main
Are you sure you want to change the base?
Conversation
991cccc
to
4e926d5
Compare
d5e6684
to
606b2ea
Compare
def test_remove_decorator_including_comment(self): | ||
py_source = "@task.virtualenv\ndef f():\n# @task.virtualenv\nimport funcsigs" | ||
res = remove_task_decorator(python_source=py_source, task_decorator_name="@task.virtualenv") | ||
assert res == "def f():\n# @task.virtualenv\nimport funcsigs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure if it is fine to remove the comments after executing ast.unparse
. If it is not an allowed behavior, we might not be able to adopt ast as a solution, since it would not preserve comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to use libcst https://pypi.org/project/libcst/ - which we alredy consider to use for our "upgrade check" tooling -> #41641 (comment)
this way we could preserve comments, line numbers etc. I think the important thing we might want to keep here is debuggability and particularly line number references, and with AST we will likely loose it - not only the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc: @jscheffl @uranusjr @ashb @kaxil @tirkarthi ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no expertise in this matter so I can not judge this.
In general I'd favor is we refactor to ensure pytests are staying the same. I am also not sure why the line feed was removed in the other tests. Is this a compatibility limitation or just a cleanup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the difference of AST vs. CST.
-
(AST) Abstract Syntax Tree - > is the tree of syntax of Python code that contains only the "meaningful" tokens - that represent a "runtime" python bytecode (with stripped out comments and other irrelevant source code tokens - such as punctuations, idents, parentheses etc.) - https://en.wikipedia.org/wiki/Abstract_syntax_tree
-
(CST) Concrete Syntax Tree -> is the tree of syntax of Python code that represent the
source code
- not Python runtime bytecode (CST includes EOL characters and comments, indents and parentheses and punctuation marks) - but also all other non-runtime tokens - https://en.wikipedia.org/wiki/Parse_tree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(that's a bit simplified definition of course - but I think it describes the difference well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're doing code modification instead of checking only, we probably should go with libcst I think 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I happened to share this topic at PyCon Taiwan this year. If anyone is interested, this was the slide I used. The content around page 67 is the most relevant.
https://speakerdeck.com/leew/unleash-the-chaos-developing-a-linter-for-un-pythonic-code?slide=67
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it's better to leave dag authors' code information as much as possible in the rendered script.py
of virtualenv operator, ideally the unit tests should not be changed, let me try LibCST instead of AST here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you need any example, I used it to check default_deferrable here https://github.com/apache/airflow/blob/main/scripts/ci/pre_commit/check_deferrable_default.py
1ec8a0a
to
28d8f2c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we decide to use CST, just mark it as draft
28d8f2c
to
2a82c10
Compare
due to #42766 the dropping of python3.8 support, I rewrite the
remove_task_decorator
to achieve the same purpose through AST.TODO:
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.