-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix!: include names of decorator argument references when building python env #3687
base: main
Are you sure you want to change the base?
Conversation
2b77d3f
to
ff5ad46
Compare
"stop_after_attempt": Executable( | ||
payload="from tenacity.stop import stop_after_attempt", kind=ExecutableKind.IMPORT | ||
), | ||
"wrapped_f": Executable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this import included twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section brings in the additional "weird" references:
(Pdb) p list(zip(code.co_freevars, func.__closure__))
[('f', <cell at 0x143ae7430: function object at 0x143b1dc60>), ('self', <cell at 0x143ae73a0: Retrying object at 0x143ae6600>), ('wrapped_f', <cell at 0x143ae73d0: function object at 0x143b1e160>)]
This checks out with tenacity's decorator implementation (notice the f
and wrapped_f
definitions): https://github.com/jd/tenacity/blob/0d40e76f7d06d631fb127e1ec58c8bd776e70d49/tenacity/__init__.py#L322-L346.
So, regarding the two imports:
- The first one corresponds to the
ref
extracted using_code_globals(code)
on theexecute
function (it brings in the names['fetch_data', 'pd']
) - The second one corresponds to
wrapped_f
.
I have an idea on how to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@izeigerman @tobymao thinking out loud: what about only extracting closure variables for functions that are defined within the module path corresponding to the active sqlmesh context? Something along the lines of:
- if func.__closure__:
- for var, value in zip(code.co_freevars, func.__closure__):
- variables[var] = value.cell_contents
+ if func.__closure__ and _is_relative_to(func.__globals__.get("__file__"), path):
+ for var, value in zip(code.co_freevars, func.__closure__):
+ variables[var] = value.cell_contents
+
+ if hasattr(func, "__wrapped__"):
+ variables.update(func_globals(func.__wrapped__, path=path))
Given these changes I'm seeing two failed tests:
y
needs to be excluded here'f'
,'func'
and'wrapped_f'
need to be excluded from thetest_serialize_env
test in this PR (functools
is still being picked up bydecorator_vars
above this section)
I added the branch with __wrapped__
because without it the retry
and stop_after_attempt
dependencies aren't included in the serialized env, which I believe is incorrect?
@@ -316,4 +326,25 @@ def test_context_manager(): | |||
kind=ExecutableKind.IMPORT, | |||
), | |||
"wraps": Executable(payload="from functools import wraps", kind=ExecutableKind.IMPORT), | |||
"functools": Executable(payload="import functools", kind=ExecutableKind.IMPORT), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we even capture this import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's included in argument list of fetch_data
's decorator:
(Pdb) p decorator_vars(func, root_node=root_node)
['wraps', 'f', 'functools', 'WRAPPER_ASSIGNMENTS']
Due to this line, specifically.
return 'test data'""", | ||
name="fetch_data", | ||
path="test_metaprogramming.py", | ||
alias="f", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does his alias come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this reply answers it.
Fixes #3640
The main difficulty in solving the linked issue was making SQLMesh detect the
stop_after_attempt
reference, in order to extract and inject it into the python environment so it can be (de)serialized:The decorator names themselves were already being picked up in the previous
decorators
helper. In this PR I expand that functionality by also searching for the decorator call arguments' references.I implemented a new visitor class in order to limit the search to only the decorator sub-trees, instead of traversing the whole function tree. Without this, we'd have to do a nested
walk
within the root node traversal loop, leading to unnecessarily revisiting nodes under the decorator sub-trees.Other than that, I also made sure to exclude callable instances of classes, because they can't be serialized. One such example is the
tenacity.Retrying
class (source). Letting these instances into the python environment doesn't "just work" unfortunately, because an error is raised once we reach this section, since there's no__name__
attribute in them.I verified that this fix works for a project with the model of interest, as well as the example project we have under
test_metaprogramming.py
.