[red-knot] Literal special form #13874
3fe84bd to fbcc66c
Nice work! This task is actually quite a bit harder than I made it sound; I was forgetting some of the complexity in recognizing special forms :) This is a really good initial effort. Let me know if any of the comments below don't make sense or need further clarification.
@@ -1130,6 +1132,7 @@ impl<'db> KnownClass {
Self::ModuleType => "ModuleType",
Self::FunctionType => "FunctionType",
Self::NoneType => "NoneType",
Self::SpecialForm => "SpecialForm",
Probably this should match the actual name of the symbol in the `typing` module?
- Self::SpecialForm => "SpecialForm",
+ Self::SpecialForm => "_SpecialForm",
(Un-resolved this comment, because it doesn't look addressed.) Is there a reason this needs to stay `"SpecialForm"` and not match the actual name in the module?
No, sorry, I accidentally force-pushed.
let annotation_ty = self.infer_annotation_expression(annotation);
let mut annotation_ty = self.infer_annotation_expression(annotation);

// If the variable is annotation with SpecialForm then create a new class with name of the
- // If the variable is annotation with SpecialForm then create a new class with name of the
+ // If the variable is annotated with SpecialForm then create a new class with name of the
This comment also doesn't look resolved?
self.infer_subscript_expression(subscript);
Type::Todo
}
ast::Expr::Subscript(subscript) => self.infer_subscript_expression(subscript),
Here is where we need the code to recognize special forms, but we should not be falling back to `infer_subscript_expression` here (that's for value expressions). Instead we should have a dedicated `infer_subscript_type_expression` method, which should use `infer_type_expression` on the value and the index, and for now handle only the case where the value is the `typing.Literal` special form; otherwise just return `Todo`.
(The fact that `infer_subscript_expression` was previously called here was just an easy placeholder way to ensure we cover all the sub-expressions, until we added proper support for inferring types correctly in type expressions.)
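A minimal sketch of the dispatch described above, using toy stand-in types rather than red-knot's real `Type` and AST. The `Expr` and `Ty` enums and the way the slice is turned into a literal type are assumptions made for illustration; only the method names `infer_type_expression` and `infer_subscript_type_expression` come from the comment.

```rust
// Toy stand-ins for illustration only; these are not red-knot's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Name(String),
    Int(i64),
    Str(String),
    Subscript { value: Box<Expr>, slice: Box<Expr> },
}

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    IntLiteral(i64),
    StrLiteral(String),
    /// What the bare name `Literal` resolves to in a type expression.
    LiteralSpecialForm,
    Todo,
}

/// Entry point for expressions that appear in type position.
fn infer_type_expression(expr: &Expr) -> Ty {
    match expr {
        Expr::Name(name) if name == "Literal" => Ty::LiteralSpecialForm,
        // Subscripts get a dedicated path instead of the value-expression one.
        Expr::Subscript { value, slice } => infer_subscript_type_expression(value, slice),
        _ => Ty::Todo,
    }
}

/// Handles `value[slice]` in type position; only `Literal[...]` is recognized.
fn infer_subscript_type_expression(value: &Expr, slice: &Expr) -> Ty {
    match infer_type_expression(value) {
        Ty::LiteralSpecialForm => match slice {
            Expr::Int(n) => Ty::IntLiteral(*n),
            Expr::Str(s) => Ty::StrLiteral(s.clone()),
            _ => Ty::Todo,
        },
        // Every other subscripted form is left as a placeholder for now.
        _ => Ty::Todo,
    }
}
```

In this sketch `Literal[1]` infers to `Ty::IntLiteral(1)`, while any other subscript such as `list[int]` falls through to `Ty::Todo`.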
That makes sense. Just one question: should we use `infer_type_expression` for all indexes?
For example, `Literal` in here is defined with an expression in the index. Others have `annotation_expression` in their grammar. So here I use `infer_type_expression` for other things, but if it's `Literal` I use `infer_expression`.
My reasoning is the case where the value is `True`: `True` by itself should not have any meaning when used alone in a type annotation.
bb69a4c to a5a4f7f
/// Lookup the type of `symbol` in the `_typeshed` module namespace.
///
/// Returns `Unbound` if the `_typeshed` module isn't available for some reason.
- /// Lookup the type of `symbol` in the `_typeshed` module namespace.
- ///
- /// Returns `Unbound` if the `_typeshed` module isn't available for some reason.
+ /// Lookup the type of `symbol` in the `typing` module namespace.
+ ///
+ /// Returns `Unbound` if the `typing` module isn't available for some reason.
This comment was marked resolved, but it looks like it is still relevant and not addressed yet? I un-resolved it.
Yes, I'm sorry about that; it caused a lot of resolved comments to become unresolved again. I will keep the comments open so I can check them again before requesting review.
I applied the comments. I will spend another day on adding diagnostic messages for https://typing.readthedocs.io/en/latest/spec/literal.html#legal-and-illegal-parameterizations
3fc52a5 to 93e3358
Okay, I added more parts of the legal and illegal parameterizations from the spec. Right now we have:
I think the remaining part is the assignability check. Right now the Literals are unwrapped to their inner type. I don't think this is the right way, is it? It works in the tests, but I'm not sure if Literal types should carry some special flags with them. I did not find an easy way to error on things like `Literal["foo".replace("o", "b")]`, because I cannot fully disable attribute expressions in the Literal since enum members are allowed.
Does this sound good? Parenthesized tuples are not allowed. Although pyright allows this, the docs state that tuples containing valid literal types are illegal. Tuples are valid in the case of `Literal["w", "r"]`, for example. Also, I'm not correctly joining unions when it's possible. I left it as a TODO in the tests:
Please let me know what you think.
@@ -568,58 +575,76 @@ impl<'db> Type<'db> {

(Type::None, Type::Instance(class_type)) | (Type::Instance(class_type), Type::None) => {
I'm thinking of going over all the instances of `Instance(class_type)` and renaming them to `Instance(instance)` so it's not misleading.
Yes, agree that we should do this before landing this PR. I did a couple more in the commit I just pushed, but not all of them.
}
}

fn infer_literal_parameter_type(&mut self, parameters: &ast::Expr) -> Type<'db> {
I created this function so I can call it on each parameter inside the `[]`, so that `Literal[Literal[expr]]` is converted to `Literal[expr]`.
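As a rough illustration of that flattening, here is a sketch over a toy parameter type. The `Param` enum and the `flatten` helper are hypothetical names invented for this example and are not part of the PR; the real function works on AST nodes and inferred types.

```rust
// Hypothetical model of the parameters written inside `Literal[...]`.
#[derive(Debug, Clone, PartialEq)]
enum Param {
    Int(i64),
    Str(String),
    /// An inner `Literal[...]` used as a parameter of an outer `Literal`.
    Nested(Vec<Param>),
}

/// Recursively replaces every nested `Literal[...]` by its own parameters,
/// so `Literal[Literal[1], "a"]` is treated like `Literal[1, "a"]`.
fn flatten(params: &[Param]) -> Vec<Param> {
    let mut out = Vec::new();
    for param in params {
        match param {
            Param::Nested(inner) => out.extend(flatten(inner)),
            other => out.push(other.clone()),
        }
    }
    out
}
```

Calling `flatten` on the parameters of `Literal[Literal[1]]` yields the same list as for `Literal[1]`, which is the behaviour the comment describes.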
value_ty => {
let value_node = value.as_ref();
let slice_ty = self.infer_expression(slice);
// TODO: currently the logic to get the type of type of a subscript in type
I needed to keep this here because we have a test case that checks we emit an unsubscriptable error in type annotations, so I kept it with a TODO to avoid breaking that test.
I think if you look carefully at those test cases, they all have their own TODO comments saying that they shouldn't emit that unsubscriptable error :)
This PR will eventually need to merge with #13943 so you can look at what @AlexWaygood did there and follow the same approach.
Explaining the commit I just pushed: My intent in suggesting |
Tests are looking really good here! I pushed some changes to the `InstanceType` implementation (so it's not Salsa-interned), and left some comments on the inference implementation.
a3: Literal[-4]
a4: Literal["hello world"]
a5: Literal[b"hello world"]
a6: Literal["hello world"]
This looks identical to `a4`?
Self::SpecialForm => {
let t = typing_symbol_ty(db, self.as_str());
debug_assert!(t.is_unbound(), "special form not found");
t
}
We don't do this assert for the other special forms, I don't think we need to here either.
- Self::SpecialForm => {
- let t = typing_symbol_ty(db, self.as_str());
- debug_assert!(t.is_unbound(), "special form not found");
- t
- }
+ Self::SpecialForm => typing_symbol_ty(db, self.as_str())
// slice_ty is treated as expression because Literal accepts expression
// inside the []
This comment seems out of place and probably unnecessary? There's nothing near this comment named `slice_ty`, and I'm not sure what "treated as expression" means -- the slice of a subscript expression is an expression in the AST, that's just a fact.
match parameters {
ruff_python_ast::Expr::StringLiteral(_)
| ruff_python_ast::Expr::BytesLiteral(_)
| ruff_python_ast::Expr::BooleanLiteral(_)
// For enum values
| ruff_python_ast::Expr::Attribute(_)
// For Another Literal inside this Literal
| ruff_python_ast::Expr::Subscript(_)
| ruff_python_ast::Expr::NoneLiteral(_) => {}
// for negative numbers
ruff_python_ast::Expr::UnaryOp(ref u) if (u.op == UnaryOp::USub || u.op == UnaryOp::UAdd) && u.operand.is_number_literal_expr() => {}
ruff_python_ast::Expr::NumberLiteral(ref number) if number.value.is_int() => {}
ruff_python_ast::Expr::Tuple(ref t) if !t.parenthesized => {}
_ => {
self.add_diagnostic(
parameters.into(),
"invalid-literal-parameter",
format_args!(
"Type arguments for `Literal` must be None, a literal value (int, bool, str, or bytes), or an enum value",
),
);
return Type::Unknown;
}
};

let slice_ty = self.infer_literal_parameter_type(parameters);

match slice_ty {
Type::Never
| Type::Unknown
| Type::Unbound
| Type::Todo
| Type::FunctionLiteral(_)
| Type::ModuleLiteral(_)
| Type::ClassLiteral(_)
| Type::Union(_)
| Type::Intersection(_)
| Type::Any => {
self.add_diagnostic(
parameters.into(),
"invalid-literal-parameter",
format_args!(
"Type arguments for `Literal` must be None, a literal value (int, bool, str, or bytes), or an enum value",
),
);
Type::Unknown
}
Type::Tuple(tuple) => {
let elts = tuple.elements(self.db);
Type::Union(UnionType::new(self.db, elts))
}
ty => ty,
}
}
I think a better approach here would be for `infer_literal_parameter_type` to return `Option<Type<'db>>`, and return `None` if the literal parameter is not valid, otherwise the right `Type`. Then you only need to emit the error in one place (if `infer_literal_parameter_type` returns `None`), and you don't need these two extra match statements here.
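A sketch of the shape being suggested, written against toy types so it stands alone. The `Expr` and `Ty` enums, the `infer_literal` wrapper and the `diagnostics` vector are assumptions for illustration; the real code would call `self.add_diagnostic` on the inference builder instead.

```rust
// Toy stand-ins for illustration only; these are not red-knot's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Str(String),
    /// Any form that is not a legal `Literal` parameter, e.g. a call.
    Call(String),
}

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    IntLiteral(i64),
    StrLiteral(String),
    Unknown,
}

/// Returns `Some(type)` for a valid literal parameter and `None` otherwise.
/// It never reports an error itself.
fn infer_literal_parameter_type(expr: &Expr) -> Option<Ty> {
    match expr {
        Expr::Int(n) => Some(Ty::IntLiteral(*n)),
        Expr::Str(s) => Some(Ty::StrLiteral(s.clone())),
        Expr::Call(_) => None,
    }
}

/// The single place where the diagnostic is emitted.
fn infer_literal(expr: &Expr, diagnostics: &mut Vec<String>) -> Ty {
    infer_literal_parameter_type(expr).unwrap_or_else(|| {
        diagnostics.push("invalid-literal-parameter".to_string());
        Ty::Unknown
    })
}
```

With this split, validity is decided once inside the `Option`-returning function, and the caller only has to handle the `None` case.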
// the values
match parameters {
ruff_python_ast::Expr::Subscript(inner_literal_subscript) => {
let inner_subscript_value = self.infer_expression(&inner_literal_subscript.value);
Doesn't seem right to use `infer_expression` here, it should be `infer_type_expression` -- and then you shouldn't have to repeat the recognition of `Literal` below, or the `invalid-literal-parameter` error; `infer_type_expression` should do all that for you? You just have to verify the type you get back is a literal type.
}
Type::Tuple(TupleType::new(self.db, elts.into_boxed_slice()))
}
_ => self.infer_expression(parameters),
Rather than matching out in `infer_parameterized_known_instance_type_expression` on AST forms known not to be valid, I think we should explicitly match here on each AST form known to be valid, and directly return the right type, without relying on `self.infer_expression`.
Co-authored-by: Carl Meyer <[email protected]>
Co-authored-by: Carl Meyer <[email protected]>
Co-authored-by: Carl Meyer <[email protected]>
A couple more comments on top of Carl's
let elts = tuple.elements(self.db);
Type::Union(UnionType::new(self.db, elts))
There's a bit of a footgun here in our current design, in that you should never really use `UnionType::new()` directly, because it doesn't deduplicate the elements in the union. Instead you should always use `UnionType::from_elements()`, which takes care of all the deduplication for you.
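To make the difference concrete, here is a toy sketch of a deduplicating union constructor. `union_from_elements` and the `Ty` enum are hypothetical and only mimic the idea behind `UnionType::from_elements()`; in particular, flattening nested unions and collapsing an empty or single-element result are assumptions of this sketch, not statements about red-knot's implementation.

```rust
// Toy stand-in for illustration only; this is not red-knot's real `Type`.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    IntLiteral(i64),
    StrLiteral(String),
    Never,
    Union(Vec<Ty>),
}

/// Builds a union while dropping duplicate elements.
fn union_from_elements(elements: impl IntoIterator<Item = Ty>) -> Ty {
    let mut unique: Vec<Ty> = Vec::new();
    for ty in elements {
        // Assumption of this sketch: nested unions are flattened first.
        let parts = match ty {
            Ty::Union(inner) => inner,
            other => vec![other],
        };
        for part in parts {
            if !unique.contains(&part) {
                unique.push(part);
            }
        }
    }
    match unique.len() {
        0 => Ty::Never,
        1 => unique.pop().unwrap(),
        _ => Ty::Union(unique),
    }
}
```

A raw constructor that stores its input unchanged would keep both copies of `Literal[1]` in `Literal[1] | Literal[1] | Literal["a"]`, whereas the helper above keeps one.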
self.add_diagnostic(
parameters.into(),
"invalid-literal-parameter",
format_args!(
"Type arguments for `Literal` must be None, a literal value (int, bool, str, or bytes), or an enum value",
),
);
The formatting here is somewhat skew-whiff (I think `cargo fmt` is failing to spot it because of the macro, unfortunately)
- self.add_diagnostic(
- parameters.into(),
- "invalid-literal-parameter",
- format_args!(
- "Type arguments for `Literal` must be None, a literal value (int, bool, str, or bytes), or an enum value",
- ),
- );
+ self.add_diagnostic(
+ parameters.into(),
+ "invalid-literal-parameter",
+ format_args!(
+ "Type arguments for `Literal` must be None, a literal value (int, bool, str, or bytes), or an enum value",
+ ),
+ );
ruff_python_ast::Expr::Tuple(t) => {
let mut elts = vec![];
for elm in &t.elts {
elts.push(self.infer_literal_parameter_type(elm));
}
Type::Tuple(TupleType::new(self.db, elts.into_boxed_slice()))
}
- ruff_python_ast::Expr::Tuple(t) => {
- let mut elts = vec![];
- for elm in &t.elts {
- elts.push(self.infer_literal_parameter_type(elm));
- }
- Type::Tuple(TupleType::new(self.db, elts.into_boxed_slice()))
- }
+ ruff_python_ast::Expr::Tuple(t) => {
+ let elements: Box<_> = t.iter().map(|elt| self.infer_literal_parameter_type(elt)).collect();
+ Type::Tuple(TupleType::new(self.db, elements))
+ }
Co-authored-by: Alex Waygood <[email protected]>
…al.md Co-authored-by: Carl Meyer <[email protected]>
Handling `Literal` type in annotations.
Resolves: #13672
Implementation
Since `Literal` is not a fully defined type in typeshed, I used a trick to figure out when a special form is a literal.
When we are inferring assignment types, I check whether the type of that assignment was resolved to typing.SpecialForm and the name of the target is `Literal`. If that is the case, I re-create a new instance type and set the known instance field to KnownInstance:Literal.
Why not define a new type?
From this issue I learned that we want to resolve members to SpecialMethod class. So if we create a new instance here, we can rely on the member resolution that already exists.
Tests
https://typing.readthedocs.io/en/latest/spec/literal.html#equivalence-of-two-literals
Since the type of the value inside `Literal` is evaluated as a literal type (LiteralString, LiteralInt, ...), two Literals are equal only when both their types and their values are equal.
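One way to picture that rule is a tagged value whose equality compares the tag and the payload together. The `LiteralValue` enum and `literals_equivalent` function below are hypothetical names for this sketch only. The two examples in the usage note follow the typing spec section linked above.

```rust
// Hypothetical model of the value carried by a `Literal[...]` type.
#[derive(Debug, Clone, PartialEq)]
enum LiteralValue {
    Int(i64),
    Bool(bool),
    Str(String),
    Bytes(Vec<u8>),
}

/// Two literals are equivalent only if both the kind (enum variant) and the
/// value match; the derived `PartialEq` checks exactly that.
fn literals_equivalent(a: &LiteralValue, b: &LiteralValue) -> bool {
    a == b
}
```

Under this model `Literal[20]` and `Literal[0x14]` are equivalent, while `Literal[0]` and `Literal[False]` are not, even though `0 == False` holds at runtime in Python.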
https://typing.readthedocs.io/en/latest/spec/literal.html#legal-and-illegal-parameterizations
The illegal parameterizations are mostly implemented. I'm currently checking the slice expression and the slice type to make sure they are valid.
Not covered:
`Literal["foo".replace("o", "b")]`, because I cannot fully disable attribute expressions in the Literal since enum members are allowed.
`Literal["w", "r"]`, for example.
The union creation with Literals is not working, because I saw comments about Union not being implemented yet.
https://typing.readthedocs.io/en/latest/spec/literal.html#shortening-unions-of-literals
Summary
Test Plan