Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CIR] Call code gen; create empty cir.func op #113483
[CIR] Call code gen; create empty cir.func op #113483
Changes from 3 commits
fd38921
88873f1
6930d7c
d8d9cce
6104178
a8f5444
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about macros/etc here?
getLine
andgetColumn
I think end up being the location itself, but there is a lot of work you can do to un-fold macros.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing that the MLIR location type doesn't have any notion of multiple nested locations due to macro expansions. So we can't represent the full complexity of macros here. The outermost location seems like the right one to choose. (I am not familiar with either the Clang location type or the MLIR location type, so I welcome being corrected here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave this open for Aaron to comment on. IIRC, the source location stored in the AST is going to be the location of the macro, which can make diagnostics awkward/confusing if what happens is inside the expansion, but if MLIR can't comprehend "in expansion of..." then this is perhaps as good as we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may be okay -- a
PresumedLoc
is sensitive to#line
and GNU line markers, etc, and it's always the expansion point of the wrapped source location. However, MLIR may eventually need to grow to understand the difference between spelling, expansion, and presumed locations.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you ever want to generate diagnostics from CIR, you probably want to just use the clang SourceLocation directly, not try to translate it to an MLIR "Location". It's extremely complicated to describe a location when macros are involved, and the abstraction clang uses really only makes sense for C preprocessor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't buy the reasoning of "
mlir::Location
exists, therefore it's the right representation". We're losing information which is likely to be important for many users of CIR. Not saying this is something we need to solve right now, but it will eventually become problematic.Converting a SourceLocation to line/column is also sort of slow (we compute line numbers lazily), but not sure how slow relative to other stuff CIR lowering needs to do anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mlir::Location
is not the right representation because it exists. It's because it is used by all the other MLIR dialects that have nothing to do with C++ and cannot have a dependency on Clang. It is wanting ClangIR to play nicely with the rest of the MLIR ecosystem, when many dialects are mixed together in a single tree later in the compilation chain, that makes usingSourceLocation
problematic.Definitely. Using
SourceLocation
instead ofmlir::Location
is not something that can be done in this PR. We would need evidence that the lack ofSourceLocation
and the richness of what it can represent is a serious problem before attempting such a radical change.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, that's also an internal implementation detail that doesn't need to be exposed to users of the CIR library. Clang works in
SourceLocation
; if CIR needs to work in something else, that should be abstracted away from the user. Still not saying this needs to be solved as part of this patch, butCIRGenModule::getLoc()
are public interfaces currently; can those be made private so we're not leaking an implementation detail of CIR to users?Two things:
__attribute__((error))
come to mind) and this has been a huge source of pain because LLVM does a pretty bad job of tracking source location with sufficient fidelity for Clang to have good QoI. If MLIR has the same issues, that should be something the MLIR folks work to correct before we transition to CIR as a default.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary use of ClangIR is intended to be of the MLIR data structures after CIR code gen has happened. The public interface for ClangIR is entirely based on MLIR, including
mlir::Location
. Some users will want to use the CIR library to convert from the Clang AST to the ClangIR MLIR dialect, but for many users of ClangIR the fact that the MLIR data structures were generated from the Clang front end and the Clang AST is an implementation detail that they don't need to know about.SourceLocation
is more of an implementation detail thanmlir::Location
is.Changing
CIRGenModule::getLoc()
to be private functions would be problematic because those functions will be used by many different classes within the ClangIR code gen code. Those functions really should have module accessibility (similar to package or assembly accessibility in Java or C#), but that's not practical until Clang/LLVM fully moves to C++20 modules.I am aware and agree that (1) back end diagnostics should be avoided where possible and should be rare, but that (2) when they do happen they can be hard to get right. We struggle with that same problem with NVC++. We should solve the problem of getting good diagnostics during the MLIR passes when we actually run into the problem. It is only then that we will know how serious the problem is, how best to solve it, and how much effort to put into it. Solving that problem now is premature and will miss the mark in some way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That depends on perspective, IMO. From our perspective, CIR is the way in which we lower Clang's AST to an IR that eventually ends up in LLVM. The public entrypoints into CIR from Clang's perspective should be using Clang's data structures. If those need to be converted internally into MLIR-specific data structures, that's fine, but that should not leak out into the public interfaces that Clang can interact with.
However, I see now that I missed something important here -- this code is under
clang/lib/CIR
and notclang/include/clang/CIR
, so it isn't in the public interfaces that Clang interacts with, it is (sufficiently) hidden as an implementation detail. Sorry for missing that!The point to these initial PRs is to ensure the community is comfortable with the design and doesn't see any glaring design concerns, so saying "we'll solve it when we get to it" on a situation we know we will hit sort of defeats the purpose. We already know the problem exists because it's one we fight with today and we already know a better way to solve it is to use source location information with at least as much fidelity as Clang is able to handle. Given that one of the benefits to CIR is for better codegen analysis that can lead to improved diagnostics, having this as an early design concern is pretty reasonable IMO. That said, we're not insisting on a change as part of this PR -- just that the CIR folks understand that this is a design concern with the facilities and at some point it may go from "concern" to "required to improve before we can proceed."