Fix!: include names of decorator argument references when building python env #3687

Open
wants to merge 2 commits into
base: main

georgesittas
Contributor

@georgesittas georgesittas commented Jan 22, 2025

Fixes #3640

The main difficulty in solving the linked issue was making SQLMesh detect the stop_after_attempt reference, in order to extract and inject it into the python environment so it can be (de)serialized:

@retry(stop=stop_after_attempt(3))
def fetch_data():
    return "test data"

The decorator names themselves were already being picked up in the previous decorators helper. In this PR I expand that functionality by also searching for the decorator call arguments' references.

I implemented a new visitor class in order to limit the search to only the decorator sub-trees, instead of traversing the whole function tree. Without this, we'd have to do a nested walk within the root node traversal loop, leading to unnecessarily revisiting nodes under the decorator sub-trees.
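To illustrate the idea of restricting the search to decorator sub-trees, here is a minimal sketch built on the standard library `ast` module. The class and function names (`DecoratorNameCollector`, `decorator_names`) are illustrative assumptions and are not the ones used in this PR; the sketch only collects bare `ast.Name` references.

```python
import ast
import textwrap
from typing import List


class DecoratorNameCollector(ast.NodeVisitor):
    """Collects names referenced inside decorator expressions only (hypothetical helper)."""

    def __init__(self) -> None:
        self.names: List[str] = []

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Walk only the decorator sub-trees; the function body is never visited
        # because generic_visit is deliberately not called here.
        for decorator in node.decorator_list:
            for child in ast.walk(decorator):
                if isinstance(child, ast.Name) and child.id not in self.names:
                    self.names.append(child.id)

    # Treat async functions the same way.
    visit_AsyncFunctionDef = visit_FunctionDef


def decorator_names(source: str) -> List[str]:
    collector = DecoratorNameCollector()
    collector.visit(ast.parse(textwrap.dedent(source)))
    return collector.names


SOURCE = '''
@retry(stop=stop_after_attempt(3))
def fetch_data():
    helper()
    return "test data"
'''

print(decorator_names(SOURCE))  # helper() in the body is not reported
```

Because the visitor never descends into the function body, names such as `helper` are skipped, while both the decorator name and its call argument reference are found.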

Other than that, I also made sure to exclude callable instances of classes, because they can't be serialized. One such example is the tenacity.Retrying class (source). Letting these instances into the python environment doesn't "just work" unfortunately, because an error is raised once we reach this section, since there's no __name__ attribute in them.
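A small standard-library sketch of why such instances are a problem: a plain function or class exposes `__name__`, but an instance of a callable class does not. `Retrier` and `is_serializable_callable` are hypothetical names chosen for illustration, and the filter shown here is an assumption, not necessarily the check this PR uses.

```python
import inspect


class Retrier:
    """Stand-in for an instance of a callable class (hypothetical)."""

    def __call__(self, fn):
        return fn


def plain_function():
    return "test data"


def is_serializable_callable(obj: object) -> bool:
    # Functions and classes carry a __name__; arbitrary callable instances do not.
    return inspect.isfunction(obj) or inspect.isclass(obj)


candidates = {
    "retrier": Retrier(),
    "plain_function": plain_function,
    "Retrier": Retrier,
}
kept = {name: obj for name, obj in candidates.items() if is_serializable_callable(obj)}

print(sorted(kept))
print(hasattr(Retrier(), "__name__"))
```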

I verified that this fix works for a project with the model of interest, as well as the example project we have under test_metaprogramming.py.

@georgesittas georgesittas requested review from izeigerman, tobymao and a team January 22, 2025 19:46
@georgesittas georgesittas force-pushed the jo/extract_decorator_dependencies branch from 2b77d3f to ff5ad46 Compare January 23, 2025 13:38
"stop_after_attempt": Executable(
payload="from tenacity.stop import stop_after_attempt", kind=ExecutableKind.IMPORT
),
"wrapped_f": Executable(
Member

Why is this import included twice?

Contributor Author

This section brings in the additional "weird" references:

(Pdb) p list(zip(code.co_freevars, func.__closure__))
[('f', <cell at 0x143ae7430: function object at 0x143b1dc60>), ('self', <cell at 0x143ae73a0: Retrying object at 0x143ae6600>), ('wrapped_f', <cell at 0x143ae73d0: function object at 0x143b1e160>)]

This checks out with tenacity's decorator implementation (notice the f and wrapped_f definitions): https://github.com/jd/tenacity/blob/0d40e76f7d06d631fb127e1ec58c8bd776e70d49/tenacity/__init__.py#L322-L346.
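The same closure layout can be reproduced without tenacity. The sketch below is a simplified stand-in (the `Retrier` class is hypothetical, not tenacity's code) whose wrapper refers to `f`, `self` and itself, so all three end up as free variables of the decorated function:

```python
import functools


class Retrier:
    """Simplified stand-in for a retrying decorator object (hypothetical)."""

    def __call__(self, f, *args, **kwargs):
        return f(*args, **kwargs)

    def wraps(self, f):
        @functools.wraps(f)
        def wrapped_f(*args, **kwargs):
            # Referring to wrapped_f inside its own body makes it a free variable too.
            wrapped_f.calls = getattr(wrapped_f, "calls", 0) + 1
            return self(f, *args, **kwargs)

        return wrapped_f


def closure_vars(func):
    """Map each free variable name of func to the value held in its closure cell."""
    code = func.__code__
    return {
        var: cell.cell_contents
        for var, cell in zip(code.co_freevars, func.__closure__ or ())
    }


retrier = Retrier()


@retrier.wraps
def fetch_data():
    return "test data"


variables = closure_vars(fetch_data)
print(sorted(variables))
```

Here `variables["wrapped_f"]` is the decorated `fetch_data` itself and `variables["self"]` is the `Retrier` instance, which mirrors the "weird" references in the pdb output above.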

So, regarding the two imports:

  • The first one corresponds to the ref extracted using _code_globals(code) on the execute function (it brings in the names ['fetch_data', 'pd'])
  • The second one corresponds to wrapped_f.

I have an idea on how to fix this.

Contributor Author

@izeigerman @tobymao thinking out loud: what about only extracting closure variables for functions that are defined within the module path corresponding to the active sqlmesh context? Something along the lines of:

-        if func.__closure__:
-            for var, value in zip(code.co_freevars, func.__closure__):
-                variables[var] = value.cell_contents
+        if func.__closure__ and _is_relative_to(func.__globals__.get("__file__"), path):
+            for var, value in zip(code.co_freevars, func.__closure__):
+                variables[var] = value.cell_contents
+
+        if hasattr(func, "__wrapped__"):
+            variables.update(func_globals(func.__wrapped__, path=path))

Given these changes I'm seeing two failed tests:

  1. y needs to be excluded here
  2. 'f', 'func' and 'wrapped_f' need to be excluded from the test_serialize_env test in this PR (functools is still being picked up by decorator_vars above this section)

I added the branch with __wrapped__ because without it the retry and stop_after_attempt dependencies aren't included in the serialized env, which I believe is incorrect?
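For reference, a self-contained sketch of the two mechanisms involved, using only the standard library. `is_relative_to` is a hypothetical stand-in for the `_is_relative_to` helper in the diff (its real behaviour may differ), and `unwrap_chain` is an illustrative name; the only thing relied on is that `functools.wraps` sets `__wrapped__` on the wrapper.

```python
import functools
from pathlib import Path
from typing import Optional


def is_relative_to(file: Optional[str], root: Path) -> bool:
    """Hypothetical stand-in: True if file lives under root, False if unknown."""
    if not file:
        return False
    try:
        Path(file).relative_to(root)
        return True
    except ValueError:
        return False


def unwrap_chain(func):
    """Yield func and every function reachable through __wrapped__."""
    while func is not None:
        yield func
        func = getattr(func, "__wrapped__", None)


def decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)

    return wrapper


@decorator
def fetch_data():
    return "test data"


# The wrapper's code object is "wrapper"; the original function is one hop away.
print([f.__code__.co_name for f in unwrap_chain(fetch_data)])
print(is_relative_to("/project/models/a.py", Path("/project")))
print(is_relative_to("/site-packages/tenacity/__init__.py", Path("/project")))
```

Following `__wrapped__` reaches the user-defined function even when the wrapper itself is defined outside the project path, which is why the extra branch is needed once closures from third-party modules are skipped.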

@@ -316,4 +326,25 @@ def test_context_manager():
kind=ExecutableKind.IMPORT,
),
"wraps": Executable(payload="from functools import wraps", kind=ExecutableKind.IMPORT),
"functools": Executable(payload="import functools", kind=ExecutableKind.IMPORT),
Member

@izeigerman izeigerman Jan 23, 2025

Why do we even capture this import?

Contributor Author

Because it's included in the argument list of fetch_data's decorator:

(Pdb) p decorator_vars(func, root_node=root_node)
['wraps', 'f', 'functools', 'WRAPPER_ASSIGNMENTS']

Due to this line, specifically.

return 'test data'""",
name="fetch_data",
path="test_metaprogramming.py",
alias="f",
Member

Where does this alias come from?

Contributor Author

I think this reply answers it.

Development

Successfully merging this pull request may close these issues.

SQLMesh decorator processing fails with AttributeError for imported decorators
2 participants