Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Parse::Node, add ElementIndex in docs for typed insts #4604

Merged
merged 7 commits into from
Dec 3, 2024

Conversation

danakj
Copy link
Contributor

@danakj danakj commented Nov 28, 2024

The documentation still referred to typed instructions having a Parse::Node field, however that was removed and moved to the InstStore in f197219.

Then GetParseNode() was renamed to GetNodeId() in 86a7c9f and then GetLocationId() in b079acd and finally GetLocId() in b5d28f2.

The comment in typed_inst.h mentions only three fields now, but some types still have four, thanks to the unmentioned ElementIndex index field. Normally this field comes last, after the [...]Id fields except for in one case, AssociatedEntity. Rather than write ambiguously ordered documentation, update the comment to and docs to say that the ElementIndex comes last, and move it to the last position in AssociatedEntity. Tests are rebased accordingly.

@danakj danakj requested a review from zygoloid November 28, 2024 21:28
@github-actions github-actions bot requested a review from josh11b November 28, 2024 21:29
@danakj danakj removed the request for review from josh11b November 28, 2024 21:30
@danakj
Copy link
Contributor Author

danakj commented Nov 28, 2024

I know there's some StructReflection which in theory could be made to do the wrong thing by changing field orders, but I couldn't find anything like that affected here, but just in case pointing it out?

@@ -96,10 +92,21 @@ instruction, as described in
- Up to two additional, kind-specific members. For example `SemIR::Assign` has
members `InstId lhs_id` and `InstId rhs_id`.

- Optionally, an `ElementIndex index` field for instructions that refer to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the kind-specific members described in the previous bullet, rather than a distinct special case. I don't think it's worth calling out as a special case here, but maybe the previous bullet could mention there are various different valid types for the kind-specific members?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've combined these two bullet points together, using them as two examples.

@@ -21,6 +21,8 @@
// value. This includes instructions that produce an abstract value, such as a
// `Namespace`, for which a placeholder type should be used.
// - Up to two `[...]Id` members describing the contents of the struct.
Copy link
Contributor

@zygoloid zygoloid Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should say: up to two members of types derived from IdBase rather than inaccurately saying `[...]Id`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though even that's not quite right. IntId and BuiltinInstKind are the exceptions. The real rule is that they need to be listed in the IdKind enum and implement the FromRaw/ToRaw protocol in inst.h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, also there's one type with three IdBase members:

struct AnyAggregateAccess {
  static constexpr InstKind Kinds[] = {
      InstKind::StructAccess, InstKind::TupleAccess,
      InstKind::ClassElementAccess, InstKind::InterfaceWitnessAccess};

  InstKind kind;
  TypeId type_id;
  InstId aggregate_id;
  ElementIndex index;
};```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah but one is type_id, okay nvm. This one threw me off but that was because I forgot the type_id rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more observations though:

  • AnyFoundationDecl and AnyBranch have an int32_t member, rather than something like an ElementIndex. ToRaw(int32_t) does not appear to be defined.
  • While IntId is, BuiltinInstKind is not in the IdKind enum. But the latter is also not a field in any typed inst. Rather than having a BuiltinInst typed inst, we now have a separate struct for each one, introduced in be5db6e.
  • The idea that builtins don't have a location id comes from the old text around SemIR::Builtin, a type that no longer exists. I can't find a reason why this would be true in the code, the parsing adds a TokenIndex at the location of a builtin instruction that should come with a source location. I am guessing this was maybe referring to the definition of builtins? And that this just no longer exists, as they are proper builtin instructions now? As such I think I would remove that clause and just say all instructions come with a location?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I understand the first two points now, and I think we can make this more explicit in the code and do some cleanup as a result, which I've done in #4606

@danakj danakj requested a review from zygoloid November 29, 2024 14:36
The documentation still referred to typed instructions having a
Parse::Node field, however that was removed and moved to the InstStore
in f197219.

Then GetParseNode() was renamed to GetNodeId() in 86a7c9f and
then GetLocationId() in b079acd and finally GetLocId() in
b5d28f2.

The comment in typed_inst.h and docs in check.md don't fully explain
what can be used as a field type in the typed instruction structs.
Update the docs to explain that any type in `SemIR::IdKind` can be used
as a field. This includes ElementIndex, which does not end in an `...Id`
suffix like the current documented rule requires.
danakj added a commit to danakj/carbon-lang that referenced this pull request Nov 29, 2024
The Any[...] instruction group structs have a field layout that matches
the specific typed instructions that it groups together. However those
specific instructions may have different field types, or different
numbers of fields, making it impossible for the Any[...] instruction
group to match them all at once.

In this case, it can use the AnyRawId field type to represent that the
specific typed instructions have different field types in that position,
or may not have a field at all.

Previously we used `int32_t` as this polymorphic placeholder type, which
was implicit and worked somewhat magically:
 - It was not listed in IdKind
 - It was was not implemented explicitly for either FromRaw or ToRaw

However it happened to work because:
 - FromRaw<T> was implemented for every T, and would build as long as T
   was constructible from the raw id value, which is int32_t.
 - In As<AnyGroupInstruction> for converting a specific instruction to a
   group instruction: For any given id field type T in the specific type
   struct, it would be converted to its raw (int32_t) value, then the
   AnyGroupInstruction field would be constructed with FromRaw<int32_t>
   since the polymorphic field type was int32_t.
 - Of course int32_t is constructible from int32_t.
 - And if the specific instruction did not have a field in the matching
   position, the AnyGroupInstruction's field would be default
   constructed, and int32_t is default constructible.

This allowed As<AnyGroupInstruction> to construct the AnyGroupInstruction type
from each of its specific instruction types regardless of what field
types they had.

We can make this more explicit by using a type other than int32_t. We
introduce the AnyRawId specifically for the purpose of being a
polymorphic field type in Any[...] instruction groups.

The AnyRawId type _is_ part of IdKind, removing the need for a special
case in the documentation and rules about field types. And this allows
us to `require` that T is in IdKind for FromRaw<T>.
 - This also pointed out that FromRaw and ToRaw do not need to be
   implemented for BuiltinTypeKind anymore, as this is no longer a field
   type in any typed instructions, further simplifying the rules to not
   need any exceptions for what is a valid field type.

The AnyRawId type participates in FromRaw by being a member of IdKind
and being constructible from int32_t.

The AnyRawId type is default constuctible so that when the specific
typed instruction has no field in the matching position, it will be
default-constructed with the InstId::InvalidIndex value.

The AnyRawId type does _not_ participate in ToRaw, and this is
documented on the type. This is because conversion from specific typed
instruction to an instruction group is lossy (due to the polymorphism
which requires the AnyRawId type). As such conversion from the
instruction group to a specific instruction is not possible, and thus
the Any[...] instruction group does not need to support being converted
to raw id values.

This is rebased on carbon-language#4604.
danakj added a commit to danakj/carbon-lang that referenced this pull request Dec 3, 2024
The Any[...] instruction group structs have a field layout that matches
the specific typed instructions that it groups together. However those
specific instructions may have different field types, or different
numbers of fields, making it impossible for the Any[...] instruction
group to match them all at once.

In this case, it can use the AnyRawId field type to represent that the
specific typed instructions have different field types in that position,
or may not have a field at all.

Previously we used `int32_t` as this polymorphic placeholder type, which
was implicit and worked somewhat magically:
 - It was not listed in IdKind
 - It was was not implemented explicitly for either FromRaw or ToRaw

However it happened to work because:
 - FromRaw<T> was implemented for every T, and would build as long as T
   was constructible from the raw id value, which is int32_t.
 - In As<AnyGroupInstruction> for converting a specific instruction to a
   group instruction: For any given id field type T in the specific type
   struct, it would be converted to its raw (int32_t) value, then the
   AnyGroupInstruction field would be constructed with FromRaw<int32_t>
   since the polymorphic field type was int32_t.
 - Of course int32_t is constructible from int32_t.
 - And if the specific instruction did not have a field in the matching
   position, the AnyGroupInstruction's field would be default
   constructed, and int32_t is default constructible.

This allowed As<AnyGroupInstruction> to construct the AnyGroupInstruction type
from each of its specific instruction types regardless of what field
types they had.

We can make this more explicit by using a type other than int32_t. We
introduce the AnyRawId specifically for the purpose of being a
polymorphic field type in Any[...] instruction groups.

The AnyRawId type _is_ part of IdKind, removing the need for a special
case in the documentation and rules about field types. And this allows
us to `require` that T is in IdKind for FromRaw<T>.
 - This also pointed out that FromRaw and ToRaw do not need to be
   implemented for BuiltinTypeKind anymore, as this is no longer a field
   type in any typed instructions, further simplifying the rules to not
   need any exceptions for what is a valid field type.

The AnyRawId type participates in FromRaw by being a member of IdKind
and being constructible from int32_t.

The AnyRawId type is default constuctible so that when the specific
typed instruction has no field in the matching position, it will be
default-constructed with the InstId::InvalidIndex value.

The AnyRawId type participates in ToRaw, though this is currently
unused.

This is rebased on carbon-language#4604.
toolchain/docs/check.md Outdated Show resolved Hide resolved
Co-authored-by: Richard Smith <[email protected]>
@danakj danakj enabled auto-merge December 3, 2024 19:33
@danakj
Copy link
Contributor Author

danakj commented Dec 3, 2024

Thanks!

Merged via the queue into carbon-language:trunk with commit d434828 Dec 3, 2024
8 checks passed
@danakj danakj deleted the typed-inst-fields branch December 3, 2024 20:18
danakj added a commit to danakj/carbon-lang that referenced this pull request Dec 3, 2024
The Any[...] instruction group structs have a field layout that matches
the specific typed instructions that it groups together. However those
specific instructions may have different field types, or different
numbers of fields, making it impossible for the Any[...] instruction
group to match them all at once.

In this case, it can use the AnyRawId field type to represent that the
specific typed instructions have different field types in that position,
or may not have a field at all.

Previously we used `int32_t` as this polymorphic placeholder type, which
was implicit and worked somewhat magically:
 - It was not listed in IdKind
 - It was was not implemented explicitly for either FromRaw or ToRaw

However it happened to work because:
 - FromRaw<T> was implemented for every T, and would build as long as T
   was constructible from the raw id value, which is int32_t.
 - In As<AnyGroupInstruction> for converting a specific instruction to a
   group instruction: For any given id field type T in the specific type
   struct, it would be converted to its raw (int32_t) value, then the
   AnyGroupInstruction field would be constructed with FromRaw<int32_t>
   since the polymorphic field type was int32_t.
 - Of course int32_t is constructible from int32_t.
 - And if the specific instruction did not have a field in the matching
   position, the AnyGroupInstruction's field would be default
   constructed, and int32_t is default constructible.

This allowed As<AnyGroupInstruction> to construct the AnyGroupInstruction type
from each of its specific instruction types regardless of what field
types they had.

We can make this more explicit by using a type other than int32_t. We
introduce the AnyRawId specifically for the purpose of being a
polymorphic field type in Any[...] instruction groups.

The AnyRawId type _is_ part of IdKind, removing the need for a special
case in the documentation and rules about field types. And this allows
us to `require` that T is in IdKind for FromRaw<T>.
 - This also pointed out that FromRaw and ToRaw do not need to be
   implemented for BuiltinTypeKind anymore, as this is no longer a field
   type in any typed instructions, further simplifying the rules to not
   need any exceptions for what is a valid field type.

The AnyRawId type participates in FromRaw by being a member of IdKind
and being constructible from int32_t.

The AnyRawId type is default constuctible so that when the specific
typed instruction has no field in the matching position, it will be
default-constructed with the InstId::InvalidIndex value.

The AnyRawId type participates in ToRaw, though this is currently
unused.

This is rebased on carbon-language#4604.
github-merge-queue bot pushed a commit that referenced this pull request Dec 3, 2024
…4606)

The Any[...] instruction group structs have a field layout that matches
the specific typed instructions that it groups together. However those
specific instructions may have different field types, or different
numbers of fields, making it impossible for the Any[...] instruction
group to match them all at once.

In this case, it can use the AnyRawId field type to represent that the
specific typed instructions have different field types in that position,
or may not have a field at all.
    
Previously we used `int32_t` as this polymorphic placeholder type, which
was implicit and worked somewhat magically:
 - It was not listed in IdKind
 - It was was not implemented explicitly for either FromRaw or ToRaw

However it happened to work because:
 - FromRaw<T> was implemented for every T, and would build as long as T
   was constructible from the raw id value, which is int32_t.
 - In As<AnyGroupInstruction> for converting a specific instruction to a
   group instruction: For any given id field type T in the specific type
   struct, it would be converted to its raw (int32_t) value, then the
   AnyGroupInstruction field would be constructed with FromRaw<int32_t>
   since the polymorphic field type was int32_t.
 - Of course int32_t is constructible from int32_t.
 - And if the specific instruction did not have a field in the matching
   position, the AnyGroupInstruction's field would be default
   constructed, and int32_t is default constructible.

This allowed As<AnyGroupInstruction> to construct the
AnyGroupInstruction type
from each of its specific instruction types regardless of what field
types they had.

We can make this more explicit by using a type other than int32_t. We
introduce the AnyRawId specifically for the purpose of being a
polymorphic field type in Any[...] instruction groups.

The AnyRawId type _is_ part of IdKind, removing the need for a special
case in the documentation and rules about field types. And this allows
us to `require` that T is in IdKind for FromRaw<T>.
 - This also pointed out that FromRaw and ToRaw do not need to be
   implemented for BuiltinTypeKind anymore, as this is no longer a field
   type in any typed instructions, further simplifying the rules to not
   need any exceptions for what is a valid field type.

The AnyRawId type participates in FromRaw by being a member of IdKind
and being constructible from int32_t.

The AnyRawId type is default constuctible so that when the specific
typed instruction has no field in the matching position, it will be
default-constructed with the InstId::InvalidIndex value.

The AnyRawId type does _not_ participate in ToRaw, and this is
documented on the type. This is because conversion from specific typed
instruction to an instruction group is lossy (due to the polymorphism
which requires the AnyRawId type). As such conversion from the
instruction group to a specific instruction is not possible, and thus
the Any[...] instruction group does not need to support being converted
to raw id values.

This is rebased on #4604.

---------

Co-authored-by: Richard Smith <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants