Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Literal special form #13874

Merged
merged 55 commits into from
Nov 5, 2024
Merged

[red-knot] Literal special form #13874

merged 55 commits into from
Nov 5, 2024

Conversation

Glyphack
Copy link
Contributor

@Glyphack Glyphack commented Oct 22, 2024

Handling Literal type in annotations.

Resolves: #13672

Implementation

Since Literals are not a fully defined type in typeshed. I used a trick to figure out when a special form is a literal.
When we are inferring assignment types I am checking if the type of that assignment was resolved to typing.SpecialForm and the name of the target is Literal if that is the case then I am re creating a new instance type and set the known instance field to KnownInstance:Literal.

Why not defining a new type?

From this issue I learned that we want to resolve members to SpecialMethod class. So if we create a new instance here we can rely on the member resolving in that already exists.

Tests

https://typing.readthedocs.io/en/latest/spec/literal.html#equivalence-of-two-literals
Since the type of the value inside Literal is evaluated as a Literal(LiteralString, LiteralInt, ...) then the equality is only true when types and value are equal.

https://typing.readthedocs.io/en/latest/spec/literal.html#legal-and-illegal-parameterizations

The illegal parameterizations are mostly implemented I'm currently checking the slice expression and the slice type to make sure it's valid.

Not covered:

  1. I did not find an easy way to error on things like Literal["foo".replace("o", "b")] because I cannot fully disable attribute expressions in the Literal since enum members are allowed.
  2. parenthesized Tuples are not allowed. Although pyright allows this in the doc is stated that tuples containing valid literal types are illegal. Tuples are valid in case of Literal["w", "r"] for example.

The union creation with Literals is not working because I saw comments about Union not implemented yet.
https://typing.readthedocs.io/en/latest/spec/literal.html#shortening-unions-of-literals

Summary

Test Plan

@Glyphack Glyphack force-pushed the literal-type branch 4 times, most recently from 3fe84bd to fbcc66c Compare October 22, 2024 22:13
@Glyphack Glyphack marked this pull request as ready for review October 22, 2024 22:26
Copy link
Contributor

github-actions bot commented Oct 22, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! This task is actually quite a bit harder than I made it sound, I was forgetting some of the complexity in recognizing special forms :) This is a really good initial effort. Let me know if any of the comments below don't make sense or need further clarification.

@@ -1130,6 +1132,7 @@ impl<'db> KnownClass {
Self::ModuleType => "ModuleType",
Self::FunctionType => "FunctionType",
Self::NoneType => "NoneType",
Self::SpecialForm => "SpecialForm",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this should match the actual name of the symbol in the typing module?

Suggested change
Self::SpecialForm => "SpecialForm",
Self::SpecialForm => "_SpecialForm",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Un-resolved this comment, because it doesn't look addressed.) Is there a reason this needs to stay "SpecialForm" and not match the actual name in the module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sorry I accidentally force pushed.

let annotation_ty = self.infer_annotation_expression(annotation);
let mut annotation_ty = self.infer_annotation_expression(annotation);

// If the variable is annotation with SpecialForm then create a new class with name of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If the variable is annotation with SpecialForm then create a new class with name of the
// If the variable is annotated with SpecialForm then create a new class with name of the

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment also doesn't look resolved?

self.infer_subscript_expression(subscript);
Type::Todo
}
ast::Expr::Subscript(subscript) => self.infer_subscript_expression(subscript),
Copy link
Contributor

@carljm carljm Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where we need the code to recognize special forms, but we should not be falling back to infer_subscript_expression here (that's for value expressions), instead we should have a dedicated infer_subscript_type_expression method, which should use infer_type_expression on the value and the index, and for now handle only the case where the value is typing.Literal special form, otherwise just return Todo.

(The fact that infer_subscript_expression was previously called here was just an easy placeholder way to ensure we cover all the sub-expressions, until we added proper support for inferring types correctly in type expressions.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Just one question, should we use infer_type_expression for all indexes?
For example Literal in here is defined with an expression in the index. Others have annotation_expression in their grammar. So here I use infer_type_expression for other things but if it's Literal I use infer_expression.

My reasoning behind it was when the value is True. The True itself should not have any meaning when used alone in the type annotation.

@carljm carljm added the red-knot Multi-file analysis & type inference label Oct 23, 2024
@Glyphack Glyphack force-pushed the literal-type branch 2 times, most recently from bb69a4c to a5a4f7f Compare October 27, 2024 17:07
Comment on lines 76 to 78
/// Lookup the type of `symbol` in the `_typeshed` module namespace.
///
/// Returns `Unbound` if the `_typeshed` module isn't available for some reason.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Lookup the type of `symbol` in the `_typeshed` module namespace.
///
/// Returns `Unbound` if the `_typeshed` module isn't available for some reason.
/// Lookup the type of `symbol` in the `typing` module namespace.
///
/// Returns `Unbound` if the `typing` module isn't available for some reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was marked resolved, but it looks like it is still relevant and not addressed yet? I un-resolved it.

Copy link
Member

@AlexWaygood AlexWaygood Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC @Glyphack did apply this suggestion using the GitHub web UI... Possibly it was accidentally lost in a force-push following that @Glyphack? :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I'm sorry about that it caused a lot of resolved ones to be unresolved again. I will keep the comments open so I check them again before requesing review.

@Glyphack
Copy link
Contributor Author

Glyphack commented Oct 27, 2024

I applied the comments will spend another day on adding diagnostic messages for https://typing.readthedocs.io/en/latest/spec/literal.html#legal-and-illegal-parameterizations

@Glyphack
Copy link
Contributor Author

Glyphack commented Oct 28, 2024

Okay I added more parts of the legal and illegal parameters from the spec. Right now we have:

  1. All Literal types I managed to do this through a lot of recursion.
  2. Nested Literals

I think the remaining part is assignability check. Right now the Literals are unwrapped to their inner type I don't think this is the right way, is it? It works in the tests but I'm not sure if Literal types should carry some special flags with themselves.

I did not find an easy way to error on things like Literal["foo".replace("o", "b")] because I cannot fully disable attribute expressions in the Literal since enum members are allowed.
I can do the same I did on nested literals:

  1. Check if it's attribute
  2. Check the type of value and if it's a class and has Enum in bases otherwise emit diagnostic.

Does this sound good?

parenthesized Tuples are not allowed. Although pyright allows this in the doc is stated that tuples containing valid literal types are illegal. Tuples are valid in case of Literal["w", "r"] for example.

Also I'm not correctly joining union when it's possible. I left it as a todo in the tests:

    # TODO: revealed: Literal[1, 2, 3, "foo", 5] | None
    reveal_type(union_var)  # revealed: Literal[1, 2, 3] | Literal["foo"] | Literal[5] | None

Please let me know what do you think.

@Glyphack Glyphack changed the title Add Literal special form to types Literal special form Oct 28, 2024
@Glyphack Glyphack changed the title Literal special form [red-knot] Literal special form Oct 28, 2024
Comment on lines +1945 to +1954
pub fn anonymous(class: ClassType<'db>) -> Self {
Self { class, known: None }
}

pub fn known(class: ClassType<'db>, known: KnownInstance) -> Self {
Self {
class,
known: Some(known),
}
}
Copy link
Member

@AlexWaygood AlexWaygood Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the known method here needs to have a different name, or it conflicts with the name of the field on the class. Rust lets you do that, but I find it pretty confusing to have a method the same name as a field that does quite a different thing 😆

How about new_known() and new_unknown for the two constructors? That would also have the nice property of making it obvious that they are parallel to each other

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I quite like the names as they are :/ There's no actual conflict; the method is an associated method, so it can't be called on an instance of the struct, only as InstanceType::known, and the field can only be accessed on an instance (self.known). And they are closely related and refer to the same thing, so it's not like we have a terminology clash that's confusing two unrelated concepts. It makes sense to me to have a constructor named known that creates an instance of the struct with its optional known field set to Some(...).

I feel like avoiding this is maybe a Python habit, where it really would be a problem.

Using new_* for alternate constructors feels unwieldy and unusual. The Default trait doesn't define a new_default method, it defines a default method. And I feel that new_unknown is too close to the Unknown type. I prefer "anonymous" over "unknown" as the opposite to a "known" instance or class.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this looks great!! A few nitpicks, but nothing serious. There's a new merge conflict (sorry!) resulting from the recent removal of Type::None in favour of Type::Instance(_typeshed.NoneType), but it shouldn't be too hard to fix, I don't think

Glyphack and others added 4 commits November 4, 2024 10:17
* main: (39 commits)
  Also remove trailing comma while fixing C409 and C419 (astral-sh#14097)
  Re-enable clippy `useless-format` (astral-sh#14095)
  Derive message formats macro support to string (astral-sh#14093)
  Avoid cloning `Name` when looking up function and class types (astral-sh#14092)
  Replace `format!` without parameters with `.to_string()` (astral-sh#14090)
  [red-knot] Do not panic when encountering string annotations (astral-sh#14091)
  [red-knot] Add MRO resolution for classes (astral-sh#14027)
  [red-knot] Remove `Type::None` (astral-sh#14024)
  Cached inference of all definitions in an unpacking (astral-sh#13979)
  Update dependency uuid to v11 (astral-sh#14084)
  Update Rust crate notify to v7 (astral-sh#14083)
  Update cloudflare/wrangler-action action to v3.11.0 (astral-sh#14080)
  Update dependency mdformat-mkdocs to v3.1.1 (astral-sh#14081)
  Update pre-commit dependencies (astral-sh#14082)
  Update dependency ruff to v0.7.2 (astral-sh#14077)
  Update NPM Development dependencies (astral-sh#14078)
  Update Rust crate thiserror to v1.0.67 (astral-sh#14076)
  Update Rust crate syn to v2.0.87 (astral-sh#14075)
  Update Rust crate serde to v1.0.214 (astral-sh#14074)
  Update Rust crate pep440_rs to v0.7.2 (astral-sh#14073)
  ...
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some things about the behavior of this PR that confused me, so I went ahead and merged in main and did the merge conflict resolution so I could explore those things. Since I did the work, going ahead and pushing it (with a few other small changes) to make sure we don't duplicate work.

Now continuing to review the rest...

a1: Literal[26]

def f():
reveal_type(a1) # revealed: @Todo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I merged in main and this changed to Literal[26], which is what it should be. I'm not sure why the merge from main fixed it? Maybe I fixed some logic in merge conflict resolution.

Comment on lines 61 to 69
```py
from typing import _SpecialForm

Literal: _SpecialForm
a1: Literal[26]

def f():
reveal_type(a1) # revealed: @Todo
```
Copy link
Contributor

@carljm carljm Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test was passing for the wrong reason (for the same reason the below test wrongly had @Todo).

The reason this test was behaving oddly turned out to be pretty subtle. Locally within the same scope, we don't consider an annotated assignment with no RHS to actually define the name (because, well, at runtime it doesn't.) So what was actually happening here on the a1: Literal[26] line is that we failed to find a value for Literal in the same scope, and we looked it up in builtins, and we found it there! Imported from typing.Literal, because builtins.pyi actually does import Literal from typing, and we haven't yet implemented the re-export rules for imports that would tell us to ignore it: #14099 So currently we wrongly treat typing.Literal as a built-in :)

I fixed this test to create its "custom" Literal in a separate type stub and import it from there, which makes the test work as expected.

* main:
  [red-knot] Implement type narrowing for boolean conditionals (astral-sh#14037)
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!! The fixes needed were minor, and I really want to get this merged now to avoid more conflict resolution needed tomorrow, so I went ahead and made the small updates and am going to go ahead and merge it. If anyone has review comments on the bits that I'm landing unreviewed (or on the InstanceType::known naming question), one of us can put up a small follow-up PR tomorrow to take care of them.

Thank you @Glyphack for all your work on this! It definitely turned out to be more complex than I'd initially realized :)

Comment on lines +1945 to +1954
pub fn anonymous(class: ClassType<'db>) -> Self {
Self { class, known: None }
}

pub fn known(class: ClassType<'db>, known: KnownInstance) -> Self {
Self {
class,
known: Some(known),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I quite like the names as they are :/ There's no actual conflict; the method is an associated method, so it can't be called on an instance of the struct, only as InstanceType::known, and the field can only be accessed on an instance (self.known). And they are closely related and refer to the same thing, so it's not like we have a terminology clash that's confusing two unrelated concepts. It makes sense to me to have a constructor named known that creates an instance of the struct with its optional known field set to Some(...).

I feel like avoiding this is maybe a Python habit, where it really would be a problem.

Using new_* for alternate constructors feels unwieldy and unusual. The Default trait doesn't define a new_default method, it defines a default method. And I feel that new_unknown is too close to the Unknown type. I prefer "anonymous" over "unknown" as the opposite to a "known" instance or class.

@carljm carljm enabled auto-merge (squash) November 5, 2024 01:41
@carljm carljm merged commit 9dddd73 into astral-sh:main Nov 5, 2024
17 checks passed
@Glyphack Glyphack deleted the literal-type branch November 5, 2024 04:36
@Glyphack
Copy link
Contributor Author

Glyphack commented Nov 5, 2024

Thanks for your time and guidance, I appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[red-knot] understand the Literal[] special form in annotations
3 participants