From d434828e21729782d3f9c05056f477f3cbf2ce21 Mon Sep 17 00:00:00 2001 From: Dana Jansens Date: Tue, 3 Dec 2024 14:44:54 -0500 Subject: [PATCH] Remove Parse::Node, add ElementIndex in docs for typed insts (#4604) The documentation still referred to typed instructions having a Parse::Node field, however that was removed and moved to the InstStore in f197219c107644. Then GetParseNode() was renamed to GetNodeId() in 86a7c9ff451f4cf and then GetLocationId() in b079acd86f00e593 and finally GetLocId() in b5d28f2c4b0fee46. The comment in typed_inst.h mentions only three fields now, but some types still have four, thanks to the unmentioned `ElementIndex index` field. Normally this field comes last, after the `[...]Id` fields except for in one case, AssociatedEntity. Rather than write ambiguously ordered documentation, update the comment to and docs to say that the ElementIndex comes last, and move it to the last position in AssociatedEntity. Tests are rebased accordingly. --------- Co-authored-by: Richard Smith --- toolchain/docs/check.md | 44 ++++++++++++++++++---------------- toolchain/sem_ir/id_kind.h | 5 ++++ toolchain/sem_ir/typed_insts.h | 15 +++++++----- 3 files changed, 38 insertions(+), 26 deletions(-) diff --git a/toolchain/docs/check.md b/toolchain/docs/check.md index 826a886aa173c..b6be6cac33154 100644 --- a/toolchain/docs/check.md +++ b/toolchain/docs/check.md @@ -84,37 +84,41 @@ instruction, as described in [sem_ir/typed_insts.h](/toolchain/sem_ir/typed_insts.h) (also see [adding features for Check](adding_features.md#check)): -- A `Parse::Node parse_node;` member that tracks its location is present on - almost all instructions, except instructions like `SemIR::Builtin` that - don't have an associated location. +- An `InstKind kind;` member if the instruction has a `Kinds` constant making + it a shorthand for multiple individual instructions. - A `SemIR::TypeId type_id;` member that describes the type of the instruction is present on all instructions that produce a value. This includes namespace instructions, which are modeled as producing a value of "namespace" type, even though they can't be used as a first-class value in Carbon expressions. -- Up to two additional, kind-specific members. For example `SemIR::Assign` has - members `InstId lhs_id` and `InstId rhs_id`. +- Up to two additional [kind-specific members](#instruction-operands). For + example `SemIR::Assign` has members `InstId lhs_id` and `InstId rhs_id`. And + `SemIR::ClassElementAccess` has an `ElementIndex` into the fields of the + class to define which field it is accessing. Instructions are stored as type-erased `SemIR::Inst` objects, which store the instruction kind and the (up to) four fields described above. This balances the -size of `SemIR::Inst` against the overhead of indirection. +size of `SemIR::Inst` against the overhead of indirection. All instructions have +with them a `LocId` that tracks their source location, and is stored separately +on the `InstStore` and retrieved by `GetLocId()`. -A `SemIR::InstBlock` can represent a code block. However, it can also be created -when a series of instructions needs to be closely associated, such as a -parameter list. +A `SemIR::InstBlockId`, used to index into `SemIR::InstBlockStore`, can +represent a code block. However, it can also be created when a series of +instructions needs to be closely associated, such as a parameter list. -A `SemIR::Builtin` represents a language built-in, such as the unconstrained -facet type `type`. We will also have built-in functions which would need to form -the implementation of some library types, such as `i32`. Built-ins are in a -stable index across `SemIR` instances. +A number of instruction types in +[sem_ir/typed_insts.h](/toolchain/sem_ir/typed_insts.h) are builtin +instructions, such as `SemIR::TypeType` which represents the unconstrained facet +type `type`. Builtins have stable ids in the `SemIR::InstStore` across `SemIR` +instances. ### Instruction operands The kind-specific members on a typed instruction struct can be of any type listed in the `SemIR::IdKind` enumeration defined in -[sem_ir/id_kind.h](/toolchain/sem_ir/id_kind.h). The most commonly used kinds -refer to other instructions: +[sem_ir/id_kind.h](/toolchain/sem_ir/id_kind.h), typically derived from +`IdBase`. The most commonly used kinds refer to other instructions: - `SemIR::InstId`: Refers to a specific instance of an instruction. For example, this should be used if the operand may have side-effects or a @@ -132,11 +136,11 @@ documentation for the ID type for more details. ### Parameters and arguments -Parameters and arguments will be stored as two `SemIR::InstBlock`s each. The -first will contain the full IR, while the second will contain references to the -last instruction for each parameter or argument. The references block will have -a size equal to the number of parameters or arguments, allowing for quick size -comparisons and indexed access. +Parameters and arguments will be stored as two blocks (referred to by +`SemIR::InstBlockId`) each. The first will contain the full IR, while the second +will contain references to the last instruction for each parameter or argument. +The references block will have a size equal to the number of parameters or +arguments, allowing for quick size comparisons and indexed access. ## SemIR textual format diff --git a/toolchain/sem_ir/id_kind.h b/toolchain/sem_ir/id_kind.h index 9bc70bbeffdc9..7e079a31cf809 100644 --- a/toolchain/sem_ir/id_kind.h +++ b/toolchain/sem_ir/id_kind.h @@ -115,6 +115,11 @@ class TypeEnum { }; // An enum of all the ID types used as instruction operands. +// +// As instruction operands, the types listed here can appear as fields of typed +// instructions (`toolchain/sem_ir/typed_insts.h`) and must implement the +// `FromRaw` and `ToRaw` protocol in `SemIR::Inst`. In most cases this is done +// by inheriting from `IdBase` or `IndexBase`. using IdKind = TypeEnum< // From base/value_store.h. IntId, RealId, FloatId, StringLiteralValueId, diff --git a/toolchain/sem_ir/typed_insts.h b/toolchain/sem_ir/typed_insts.h index 5ff47f5426b57..73469dd6e0341 100644 --- a/toolchain/sem_ir/typed_insts.h +++ b/toolchain/sem_ir/typed_insts.h @@ -20,7 +20,10 @@ // - Optionally, a `TypeId type_id;` member, for instructions that produce a // value. This includes instructions that produce an abstract value, such as a // `Namespace`, for which a placeholder type should be used. -// - Up to two `[...]Id` members describing the contents of the struct. +// - Up to two members describing the contents of the struct. These are types +// listed in the `SemIR::IdKind` type-enum, typically derived from `IdBase`. +// +// TODO: There are some `int32_t` fields, which breaks the above rule, how/why? // // The field names here matter -- the fields must have the names specified // above, when present. When converting to a `SemIR::Inst`, the `kind` and @@ -32,16 +35,16 @@ // enumeration. This `Kind` declaration also defines the instruction kind by // calling `InstKind::Define` and specifying additional information about the // instruction kind. This information is available through the member functions -// of the `InstKind` value declared in `inst_kind.h`, and includes the name -// used in textual IR and whether the instruction is a terminator instruction. +// of the `InstKind` value declared in `inst_kind.h`, and includes the name used +// in textual IR and whether the instruction is a terminator instruction. // // Struct types can also be provided for categories of instructions with a // common representation, to allow the common representation to be accessed // conveniently. In this case, instead of providing a constant `Kind` member, // the struct should have a constant `InstKind Kinds[];` member that lists the -// kinds of instructions in the category, and an `InstKind kind;` member that -// is used to identify the specific kind of the instruction. Separate struct -// types still need to be defined for each instruction kind in the category. +// kinds of instructions in the category, and an `InstKind kind;` member that is +// used to identify the specific kind of the instruction. Separate struct types +// still need to be defined for each instruction kind in the category. namespace Carbon::SemIR {