Generate Cpp namespace when import Cpp is used #4873
base: trunk
Also defined a dedicated `ImportCppDecl` `InstKind`.
toolchain/check/check_unit.cpp
Outdated
@@ -340,17 +345,22 @@ auto CheckUnit::ImportCppPackages() -> void {
    return;
  }

  IdentifierId package_id = imports.front().names.package_id;
  for (size_t i = 1; i < imports.size(); ++i) {
    CARBON_CHECK(imports[i].names.package_id == package_id);
Might be nicer to use `all_of` here instead of a for loop? If you like:
  IdentifierId package_id = imports.front().names.package_id;
  CARBON_CHECK(llvm::all_of(imports, [&](const CppImport& import) {
    return import.names.package_id == package_id;
  }));
Done.
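For readers outside the toolchain, here is a minimal standalone sketch of the same "every element matches the first" check. It uses `std::all_of` from the standard library rather than the LLVM range helper. The `CppImport` struct and the `AllSamePackage` function are simplified stand-ins invented for illustration, not the real toolchain types.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the toolchain's import record.
struct CppImport {
  int package_id;
  std::string library;
};

// Returns true when every import names the same package as the first one.
// An empty list trivially satisfies the check.
inline auto AllSamePackage(const std::vector<CppImport>& imports) -> bool {
  if (imports.empty()) {
    return true;
  }
  int package_id = imports.front().package_id;
  return std::all_of(imports.begin(), imports.end(),
                     [&](const CppImport& cpp_import) {
                       return cpp_import.package_id == package_id;
                     });
}
```

Compared with an index-based loop, the predicate form states the invariant directly and removes the off-by-one risk of starting at index 1.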
toolchain/check/import_cpp.cpp
Outdated
      first_import_decl_id, SemIR::AccessKind::Public);
  CARBON_CHECK(inserted);

  CARBON_CHECK(first_import_decl_id.has_value());
`GetLocId()` checks the same thing (via `>= 0`), so I think this isn't needed here.
Done.
toolchain/check/import_cpp.cpp
Outdated
  auto import_loc_id = context.insts().GetLocId(first_import_decl_id);

  auto namespace_inst = SemIR::Namespace{
      namespace_type_id, SemIR::NameScopeId::None, first_import_decl_id};
I believe we'd like the fields to be named in here:
carbon-lang/docs/project/cpp_style_guide.md, line 132 in 133717c:
- Use designated initializers (`{.a = 1}`) when possible for structs,
(I see import.cpp didn't name them here, but does in other cases, probably an oversight/inconsistency there.)
Done.
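As a rough illustration of the style-guide point, the sketch below builds a struct with C++20 designated initializers. `NamespaceInst`, its `int` fields, and `MakeNamespace` are hypothetical placeholders, not the actual `SemIR::Namespace` definition, which this thread does not show.

```cpp
// Hypothetical stand-in; the real SemIR types are not reproduced here.
struct NamespaceInst {
  int type_id;
  int name_scope_id;
  int import_id;
};

inline auto MakeNamespace(int type_id, int import_id) -> NamespaceInst {
  // Naming each field shows which id goes where. In C++20, initializers
  // listed out of declaration order are rejected by the compiler.
  return {.type_id = type_id, .name_scope_id = -1, .import_id = import_id};
}
```

With three ids of similar type, positional initialization makes it easy to swap two arguments without any diagnostic. The named form makes such a swap visible in review.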
toolchain/check/import_cpp.cpp
Outdated
      namespace_id, name_id, SemIR::NameScopeId::Package);
  context.ReplaceInstBeforeConstantUse(namespace_id, namespace_inst);

  // Note we have to get the parent scope freshly, creating the imported
Maybe we should just avoid the local variable entirely? It's only used once each time. And leave this comment around explaining why we don't hold a local variable instead?
Done.
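The hazard behind this comment can be sketched in isolation. Assume the scopes live in a growable container, which is an assumption here; `NameScopeStore` and `AddChildScope` are invented for illustration. Adding a scope may then reallocate storage, so the parent is looked up by id only after the insertion instead of through a reference held across it.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope record; only the growth behaviour matters here.
struct NameScope {
  std::string name;
  std::vector<std::size_t> children;
};

class NameScopeStore {
 public:
  // Appending may reallocate, invalidating references into the store.
  auto Add(std::string name) -> std::size_t {
    scopes_.push_back({std::move(name), {}});
    return scopes_.size() - 1;
  }
  auto Get(std::size_t id) -> NameScope& { return scopes_[id]; }

 private:
  std::vector<NameScope> scopes_;
};

inline auto AddChildScope(NameScopeStore& store, std::size_t parent_id,
                          std::string name) -> std::size_t {
  std::size_t child_id = store.Add(std::move(name));
  // Fetch the parent only after Add(); a reference taken earlier could dangle.
  store.Get(parent_id).children.push_back(child_id);
  return child_id;
}
```

Passing ids instead of references keeps the call site correct regardless of whether the container happened to reallocate.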
// CHECK:STDOUT: package: <namespace> = namespace [template] {
// CHECK:STDOUT: .Cpp = imports.%Cpp
// CHECK:STDOUT: }
// CHECK:STDOUT: import_cpp
This looks different from the regular import semir. For example, stripped down from `toolchain/check/testdata/alias/fail_builtins.carbon`:
// CHECK:STDOUT: imports {
// CHECK:STDOUT: %Core: <namespace> = namespace file.%Core.import, [template] {
// CHECK:STDOUT: .Int = %Core.Int
// CHECK:STDOUT: }
// CHECK:STDOUT: }
// CHECK:STDOUT:
// CHECK:STDOUT: file {
// CHECK:STDOUT: package: <namespace> = namespace [template] {
// CHECK:STDOUT: .Core = imports.%Core
// CHECK:STDOUT: }
// CHECK:STDOUT: %Core.import = import Core
// CHECK:STDOUT: }
a) The `file` name is `%Core.import`, but here we have `import_cpp`. When there's more than one `import_cpp`, they all have the same name, and there's no `%` attached to it.
b) `%Core.import` is `= import Core`, but there's nothing attached to `import_cpp`.
c) In the `imports` block, `%Core` is set to `namespace file.%Core.import`, which matches the name we see in the `file` block. But for cpp we have `%Cpp` set to `namespace file.%cpp_import`, which is not the same as `import_cpp` in `file`.
How do we make this more consistent?
That's a good question.
I think the main difference is that when we import Cpp files, we generate a header that includes all the imported Cpp headers and process the generated header, so all Cpp imports are actually merged.
a) `import` is replaced by `import_cpp`, as expected. There is no specific package name because all `import_cpp` are for `Cpp`.
b) That's because of the generated file that merges all `import Cpp`s.
c) I've replaced `cpp_import` with `import_cpp`, so it matches better.
That said, I'm looking for suggestions on what the output should look like, given that we merge all `import Cpp` together. Perhaps some improvements can be made in this PR and more complicated ones in follow-ups.
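To make the merging step concrete, here is a small sketch of building one synthetic header from several imported library names. The function name `BuildMergedHeader` and the exact text layout are assumptions for illustration. The thread only states that a header including all imported Cpp headers is generated and processed.

```cpp
#include <string>
#include <vector>

// Builds the text of a synthetic header that includes every imported file,
// in the order the imports were written. Hypothetical helper, not the
// toolchain's actual implementation.
inline auto BuildMergedHeader(const std::vector<std::string>& libraries)
    -> std::string {
  std::string header;
  for (const std::string& library : libraries) {
    header += "#include \"" + library + "\"\n";
  }
  return header;
}
```

Because the compiler front end only ever sees this one merged buffer, a per-import instruction has nothing distinct to point at, which is consistent with the single-name behaviour described above.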
`ImportDecl` and `ImportCppDecl` have separate printing behavior. When you see `%Core.import = import Core`, it's declaring an `ImportDecl` with `Core` as an argument, and assigning a name as `%Core.import`. Right now `CppImport` doesn't have any arguments, so nothing on the RHS should be expected.
@bricknerb In `formatter.cpp`, if you copy `auto FormatInstLHS(InstId inst_id, ImportDecl /*inst*/) -> void` to add a similar overload for `ImportCppDecl`, what's the result? I think you might want to aim for something like `%Cpp = import_cpp`, which is how we indicate the instruction has a name.
Thanks Jon!
Added a simple overload and we now have
%import_cpp = import_cpp
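The overload-per-instruction idea can be sketched with a toy formatter. Everything below is simplified and hypothetical: the real `FormatInstLHS` takes an `InstId` and writes to a stream, whereas this version returns strings, and `FormatInstRHS` and `FormatInst` are invented names so the result is easy to inspect.

```cpp
#include <string>

// Hypothetical, heavily simplified instruction types.
struct ImportDecl {
  std::string package_name;
};
struct ImportCppDecl {};

// One overload per instruction kind decides how the left-hand side is named.
inline auto FormatInstLHS(const ImportDecl& inst) -> std::string {
  return "%" + inst.package_name + ".import";
}
inline auto FormatInstLHS(const ImportCppDecl& /*inst*/) -> std::string {
  return "%import_cpp";
}

// The right-hand side prints the instruction kind plus any arguments.
inline auto FormatInstRHS(const ImportDecl& inst) -> std::string {
  return "import " + inst.package_name;
}
inline auto FormatInstRHS(const ImportCppDecl& /*inst*/) -> std::string {
  return "import_cpp";
}

// Overload resolution picks the matching pair for each instruction type.
template <typename InstT>
auto FormatInst(const InstT& inst) -> std::string {
  return FormatInstLHS(inst) + " = " + FormatInstRHS(inst);
}
```

Under these assumptions, `FormatInst(ImportDecl{"Core"})` yields `%Core.import = import Core` and `FormatInst(ImportCppDecl{})` yields `%import_cpp = import_cpp`.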
…has an equivalent check
toolchain/check/import_cpp.cpp
Outdated
@@ -118,10 +121,11 @@ static auto AddNamespace(Context& context, IdentifierId cpp_package_id,

  // Note we have to get the parent scope freshly, creating the imported
  // namespace may invalidate the pointer above.
- // namespace may invalidate the pointer above.
+ // namespace may invalidate any pointers into name_scopes().
Done.
@@ -26,7 +26,7 @@ import Cpp library "file2.h";
 // CHECK:STDOUT: --- multiple_imports.carbon
 // CHECK:STDOUT:
 // CHECK:STDOUT: imports {
-// CHECK:STDOUT: %Cpp: <namespace> = namespace file.%cpp_import.loc4, [template] {}
+// CHECK:STDOUT: %Cpp: <namespace> = namespace file.%import_cpp.loc4, [template] {}
The location is interesting, considering they are all being merged as you said. I'm not sure what that will do for diagnostics, if it will be hard to know which import is bringing in a clang error? Or maybe that is clear in diagnostics but we just don't have them all listed in semir? Then that might make tooling difficult though.
I think this is okay for now, but maybe we can think about improvements if this is a pain. Perhaps we could have multiple `%Cpp` imports and they'd get `.xyz` suffixes in semir.
This is how all imported namespaces behave -- C++ isn't any different here. i.e., you should expect the same result from:
import Foo library "bar";
import Foo library "baz";
The name `Foo` can have only one location associated with it.
When it comes to specific names, we can have more specific locations. We probably won't have that short-term for C++ due to the complexities involved, and how `#include` semantics need to take priority. But a single name -- whether `Cpp`, `Foo`, or something else -- has a single location in SemIR.
I'm trying to understand the instruction model choice here, maybe you can fill out some detail for me?
At present, it looks like each `import Cpp` line gets its own instruction. Then, these are combined into a single namespace. The import appears to only be associated with the first `import Cpp`, is that correct?
Had you considered instead having the `Cpp` namespace backed by a single instruction, and having that linked to a structure that stores the information about the `Cpp` data as part of `SemIR::File`?
Some of the significance for me is in how name references work. I'd expect SemIR for something like `Cpp.printf` to read a bit like:
%import_cpp = import_cpp
%printf.ref: type = name_ref printf, %import_cpp.printf [template = constants.%printf]
Note here that only one `ImportCppDecl` can actually end up in name lookup, and this is akin to how we use `ImportDecl` when there are multiple imports from the same package: the instruction's argument is a package, and internally it decides how to split name references across libraries.
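One way to picture the proposed model is a single instruction for the `Cpp` namespace, with the per-import details kept in a side table on the file. The sketch below is an assumption-heavy illustration: the `ImportCpp` field, the `File` layout, the `file1.h` and `file2.h` example names, and especially the printed format are invented here and are not taken from the toolchain.

```cpp
#include <string>
#include <vector>

// Hypothetical per-import record; the real fields are not shown in the thread.
struct ImportCpp {
  std::string library;
};

// Hypothetical slice of a file: every `import Cpp` is recorded here, while a
// single instruction stands for the `Cpp` namespace as a whole.
struct File {
  std::vector<ImportCpp> import_cpps;
};

// Renders the single instruction followed by one line per recorded import.
// The output layout is invented for this sketch.
inline auto FormatImportCppDecl(const File& file) -> std::string {
  std::string out = "%import_cpp = import_cpp {\n";
  for (const ImportCpp& import_cpp : file.import_cpps) {
    out += "  import Cpp \"" + import_cpp.library + "\"\n";
  }
  out += "}";
  return out;
}
```

Name references would then all resolve through the one instruction, while tooling that needs the individual imports can read them from the side table.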
Note, to try to build some contrast for how imports from a package work...
We get the detail of the imported libraries inside the import data:
At the file level, they're combined:
I think here the |
BTW, to correct this and connect to the cross-package example... I think a good goal would be something like:
|
toolchain/check/import_cpp.cpp
Outdated
static auto AddNamespace(Context& context, IdentifierId cpp_package_id,
                         SemIR::InstId first_import_decl_id) -> void {
Had you tried something like using `import.cpp`'s `AddNamespace`, i.e.:
AddNamespace(context, namespace_type_id, name_id, SemIR::NameScopeId::Package,
             /*diagnose_duplicate_namespace=*/true,
             [&] { return first_import_decl_id; });
What was the difference between that and what you're trying to do here?
I've changed it further, but I still believe you're right that there are a lot of similarities, and that's why I have a TODO to extract and deduplicate the common logic.
Once we agree on the functionality, I can address this TODO in a follow-up PR or as part of this PR.
toolchain/sem_ir/inst_namer.cpp
Outdated
@@ -705,6 +705,10 @@ auto InstNamer::CollectNamesInBlock(ScopeId top_scope_id,
        add_inst_name(out.TakeStr());
        continue;
      }
      case ImportCppDecl::Kind: {
        add_inst_name("import_cpp");
- add_inst_name("import_cpp");
+ add_inst_name("Cpp.import_cpp");
We're typically trying to reflect the identifier as part of instruction names so that the parallels with code are easier to see. How about this, which is close to the `ImportDecl` format?
(noting this from the `%import_cpp = import_cpp` reply, this should make it `%Cpp.import_cpp = import_cpp`)
Done.
… one for the `Cpp` namespace. Create a new `ImportCpp` struct that contains information per `import Cpp`, and store all instances in `File`. Format `ImportCppDecl` by printing all of the file's `ImportCpp`. Name `ImportCppDecl` as `"Cpp.import_cpp"` instead of `"import_cpp"`.
Thanks Jon! I've changed the logic to have a single |
Also defined a dedicated `ImportCppDecl` `InstKind`.
Part of #4666