-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Fallback to attributes on types.ModuleType if a symbol can't be found in locals or globals #13904
Conversation
|
dcccb7a
to
22fc347
Compare
CodSpeed Performance ReportMerging #13904 will not alter performanceComparing Summary
|
22fc347
to
7dabd5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I do think we need to resolve the regression here, if at all possible; this is too much of an edge-case to pay that much perf for.
3739088
to
ae8ab30
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I assume you are still planning to try replacing the hard coded allowlist with a salsa query based on actual attributes found in typeshed, just noticed one minor terminology nit.
crates/red_knot_python_semantic/resources/mdtest/scopes/moduletype_attrs.md
Outdated
Show resolved
Hide resolved
2f14b09
to
abb4375
Compare
Okay, I've switched from the hardcoded allowlist to a Salsa-cached generated allowlist! Locally I'm consistently seeing a regression of 1-2% on the incremental benchmark, but Codspeed reports the benchmarks are neutral on this PR, so maybe what I'm seeing locally is just noise. In order to obtain a list of symbols that typeshed declares in the body of I've also fixed several edge cases regarding accessing members as attributes on imported modules. In some cases, whether a module "defines" a member can vary depending on whether you access the member as a global variable from inside the module or as an attribute on the module object after it's been imported elsewhere. TL;DR: ready for re-review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems unfortunate to me that we now have to track DECLARED
for all symbols just for this one use case.
} | ||
|
||
bitflags! { | ||
#[derive(Copy, Clone, Debug, Eq, PartialEq)] | ||
struct SymbolFlags: u8 { | ||
const IS_USED = 1 << 0; | ||
const IS_BOUND = 1 << 1; | ||
const IS_BOUND = 1 << 1; | ||
const IS_DECLARED = 1 << 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some comment explaining IS_DECLARED
's meaning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a link to Carl's essay at the top of https://github.com/astral-sh/ruff/blob/main/crates/red_knot_python_semantic/src/semantic_index/use_def.rs
1e40cec
to
9ff6f66
Compare
I would wager that this information will come in useful in other contexts in the future (though I can't right now say what -- it's just an intuition I have). However, one alternative way of tackling this would be to write a Python script that inspects the AST of the typeshed stub for This would be a more complicated than the current solution. But it would have the following advantages:
I'm not sure whether this would be better or worse on net than the solution I currently have in this PR -- but it's an alternative to consider, anyway! |
Yeah. I'm not a fan of it but I don't consider it blocking. I guess an alternative to this would be to look at the class type and iterate over its members. I don't think we support it today, but I suspect that's a capability we want long-term. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! A few comments that I think are worth at least testing and/or mentioning in TODOs, even if we don't make code changes for them right now.
an alternative to this would be to look at the class type and iterate over its members. I don't think we support it today, but I suspect that's a capability we want long-term.
I agree that we'll want this, and I think the best / most efficient way to implement this would be to iterate over symbols in the symbol table that are defined/declared in the class, using IS_DEFINED
and IS_DECLARED
flags. So I would flip it around and say this is not an alternative approach, rather it's another likely future use case for this flag.
crates/red_knot_python_semantic/resources/mdtest/scopes/moduletype_attrs.md
Outdated
Show resolved
Hide resolved
# TODO: this should probably be added to typeshed; not sure why it isn't? | ||
# error: [unresolved-reference] | ||
# revealed: Unbound | ||
reveal_type(__doc__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably because it's on builtins.object
?
The awkward thing is that I think __doc__
might be the only attribute on object
that is accessible in every module as a global name.
I think all of them except for __module__
are accessible on module objects as an attribute.
You'll know better than I whether it's likely we can get __doc__
added specifically to ModuleType
(despite it already being on object
) to help support fixing this TODO with fewer special cases, or whether we should instead just bite the bullet and handle it as a special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I think it would be fine to add __doc__
to ModuleType
in typeshed if we add a comment saying why we have the "redundant" override from object
... I'll try to put up a typeshed PR today...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed python/typeshed#12918
reveal_type(__init__) | ||
``` | ||
|
||
## Accessed as attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__class__
is another one defined on object
that is accessible on ModuleType
instances, but not as a module global name.
__module__
is an odd special case in that typeshed says it exists on object
but it does not actually exist on module objects. Probably not worth modeling this LSP violation, it's fine if we just let it exist on all objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__module__
is an odd special case in that typeshed says it exists onobject
but it does not actually exist on module objects. Probably not worth modeling this LSP violation, it's fine if we just let it exist on all objects?
Sigh. __module__
really shouldn't be listed as an instance attribute on object
; it's just obviously incorrect. But I'm not going to restart that conversation 🙃
I think this is not worth special-casing in red-knot; we should just accept it as an inaccuracy in type inference that we have to live with unless and until it's fixed in typeshed.
crates/red_knot_python_semantic/resources/mdtest/scopes/moduletype_attrs.md
Show resolved
Hide resolved
…'t be found in locals or globals
9ff6f66
to
eee3e80
Compare
Summary
All modules in Python are instances of
types.ModuleType
. Various attributes ontypes.ModuleType
in typeshed represent "implicit globals" that are part of the namespace of every module in Python, even if not explicitly defined. As such, before declaring that a symbol is unbound (and in fact, before looking the symbol up in thebuiltins
scope) we should look the symbol up as an attribute ontypes.ModuleType
. The only attributes ontypes.ModuleType
that are not present as implicit globals are__dict__
,__getattr__
and__init__
.Test Plan
Markdown test added as
crates/red_knot_python_semantic/resources/mdtest/scopes/moduletype_attrs.md
. We also get to remove a false-positive error from thetomllib
benchmark 🎉