Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DumpId() methods to Check::Context, Parse::Tree, Lex::TokenizedBuffer #4620

Closed
wants to merge 17 commits into from

Conversation

danakj
Copy link
Contributor

@danakj danakj commented Dec 3, 2024

No description provided.

@danakj danakj force-pushed the dump branch 2 times, most recently from 856f2e7 to 9bb864a Compare December 4, 2024 17:36
@danakj danakj requested a review from jonmeow December 5, 2024 19:35
danakj added 11 commits December 5, 2024 14:35
…ffer

These methods are useful for calling in a debugger, and will dump
verbose information from an id value.

Currently this is done in a pretty simple manner. The Lex and Parse
methods are each replicated onto the Check::Context class manually (and
same for Lex onto Parse::Tree).

The DumpId methods are implemented in this PR for:
- Lex::TokenIndex
- Parse::NodeId
- SemIR::LocId
@danakj danakj changed the title Add a Check::Context::Dump(T) that can be overloaded for any T Add DumpId() methods to Check::Context, Parse::Tree, Lex::TokenizedBuffer Dec 5, 2024
@danakj
Copy link
Contributor Author

danakj commented Dec 5, 2024

Thanks for all the feedback on discord #toolchain. After determining what is possible this is a slimmer version that uses explicit forwarding from Check->Parse/Lex and Parse->Lex for dumping as suggested by @jonmeow . I have used the name DumpId as suggested by @geoffromer. I did not use a template, as the interactive debugger experience seems to be degraded (having to type angle brackets and types) even if we can resolve the linking issues created.

@danakj danakj marked this pull request as ready for review December 5, 2024 19:41
@github-actions github-actions bot requested a review from geoffromer December 5, 2024 19:41
toolchain/check/dump_id.h Outdated Show resolved Hide resolved
toolchain/check/dump_id.h Outdated Show resolved Hide resolved
toolchain/check/dump_id.h Outdated Show resolved Hide resolved
toolchain/check/dump_id.cpp Outdated Show resolved Hide resolved
@danakj danakj requested a review from geoffromer December 9, 2024 19:30
@jonmeow
Copy link
Contributor

jonmeow commented Dec 9, 2024

Regarding my comment about being hesitant about adding Context member functions on #toolchain, I really meant something like context.dumper().Dump where the dumper could be constructed with a pointer to Context, similar to what we do with DeclNameStack and other functionality. What led you to choose the CRTP base class approach?

(to be sure, I think we agreed the Dumpable CRTP base class approach on individual IDs wouldn't work -- I hadn't expected a different kind of CRTP base class on the main Context object)

Comment on lines 46 to 48
auto DumpIdMethods<Context>::Newline() const -> void {
llvm::errs() << "\n";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the function for this instead of doing the write directly? Maybe the function should have some comments if it's needed?

Also, maybe this should be WriteNewline? We typically name functions describing the action being taken, and it actually was unclear to me what Newline would do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid putting streaming includes into the header file, since operator<< is bad for compile times. Is this something we think about here? I took a quick look around and saw that the streaming includes had been generally avoided in headers elsewhere so far.

Copy link
Contributor

@jonmeow jonmeow Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure which include you're looking for, but "common/ostream.h" seems to be relatively common in headers. (it's not something we avoid, it's actually more by design for correctness)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok noted, thanks, I will not try to avoid specific includes in header files then. I moved to using this header everywhere.

@danakj
Copy link
Contributor Author

danakj commented Dec 9, 2024

Regarding my comment about being hesitant about adding Context member functions on #toolchain, I really meant something like context.dumper().Dump where the dumper could be constructed with a pointer to Context, similar to what we do with DeclNameStack and other functionality. What led you to choose the CRTP base class approach?

I wanted to reduce the number of steps you have to go through to dump things. Having to find and remember to construct a dumper type felt worse than having the methods on Context directly. But having a bunch of overloads right in context.h won't be great, so this felt like the right compromise to me.

The use of CRTP specifically is to avoid state (passing a pointer up to the base class) and to avoid a Context parameter on the methods.

@danakj danakj requested a review from jonmeow December 9, 2024 20:18
@jonmeow
Copy link
Contributor

jonmeow commented Dec 9, 2024

Regarding my comment about being hesitant about adding Context member functions on #toolchain, I really meant something like context.dumper().Dump where the dumper could be constructed with a pointer to Context, similar to what we do with DeclNameStack and other functionality. What led you to choose the CRTP base class approach?

To be sure, I'm suggesting a construction wouldn't be needed, it's just writing context.dumper().Dump instead of context.Dump

I wanted to reduce the number of steps you have to go through to dump things. Having to find and remember to construct a dumper type felt worse than having the methods on Context directly. But having a bunch of overloads right in context.h won't be great, so this felt like the right compromise to me.

The use of CRTP specifically is to avoid state (passing a pointer up to the base class) and to avoid a Context parameter on the methods.

Maintaining shortness, what if we had Carbon::Dump(Context&, ..) Carbon::Dump(TokenizedBuffer&, ...) etc? i.e., Dump(context instead of context.Dump

Copy link
Contributor

@geoffromer geoffromer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be sure, I'm suggesting a construction wouldn't be needed, it's just writing context.dumper().Dump instead of context.Dump

That's more verbose than I'd like, considering that this function is intended to be called by typing out the call expression live.

Maintaining shortness, what if we had Carbon::Dump(Context&, ..) Carbon::Dump(TokenizedBuffer&, ...) etc? i.e., Dump(context instead of context.Dump

That seems like it could work, assuming I can call it from a debugger without the Carbon:: prefix. (I'm not sure if the change from DumpId to Dump was part of your point, but I think we should reserve Dump for functions that print the value itself, rather than printing the thing the value refers to).

toolchain/check/dump_id.h Outdated Show resolved Hide resolved
Comment on lines 22 to 25
// This class is inherited by `Check::Context`, which provides itself as the
// template parameter. The methods are provided here instead of on `Context`
// directly to avoid cluttering the `Context` class with overloads for every
// dumpable id type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a good tradeoff. The clutter of making these methods of Context seems minimal compared to the complexity associated with the CRTP mixin (especially given how many methods Context already has).

toolchain/check/dump_id.h Outdated Show resolved Hide resolved

#include "toolchain/lex/dump_id.h"

#include "llvm/Support/raw_ostream.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, to the point about common/ostream.h -- please include it instead of llvm/Support/raw_ostream.h. As noted in common/ostream.h, I'm trying to avoid having people include raw_ostream.h directly, in order to get more consistent behavior.

@danakj
Copy link
Contributor Author

danakj commented Dec 10, 2024

To be sure, I'm suggesting a construction wouldn't be needed, it's just writing context.dumper().Dump instead of context.Dump

That's more verbose than I'd like, considering that this function is intended to be called by typing out the call expression live.

Maintaining shortness, what if we had Carbon::Dump(Context&, ..) Carbon::Dump(TokenizedBuffer&, ...) etc? i.e., Dump(context instead of context.Dump

That seems like it could work, assuming I can call it from a debugger without the Carbon:: prefix. (I'm not sure if the change from DumpId to Dump was part of your point, but I think we should reserve Dump for functions that print the value itself, rather than printing the thing the value refers to).

Hm, I tried this, with free functions in the Carbon namespace.

namespace Carbon {
LLVM_DUMP_METHOD auto DumpId(const Check::Context& context, SemIR::LocId loc_id)
    -> void;
}  // namespace Carbon

But gdb is not finding them at all, Carbon:: or not.

(gdb) call DumpId(*this, loc_id_and_inst.loc_id)
No symbol "DumpId" in current context.
(gdb) call Carbon::DumpId(*this, loc_id_and_inst.loc_id)
No symbol "DumpId" in namespace "Carbon".

@danakj
Copy link
Contributor Author

danakj commented Dec 10, 2024

But gdb is not finding them at all, Carbon:: or not.

I think --gc-sections is going to just remove free functions on us. Either __attribute__((used)) is keeping methods around on types that are used or they are just kept generally in our debug builds. But free functions, marked with used or not, are being removed.

I can move them to just be methods in the respective types though.

@danakj danakj requested review from geoffromer and jonmeow December 10, 2024 20:06
@dwblaikie
Copy link
Contributor

To be sure, I'm suggesting a construction wouldn't be needed, it's just writing context.dumper().Dump instead of context.Dump

That's more verbose than I'd like, considering that this function is intended to be called by typing out the call expression live.

Maintaining shortness, what if we had Carbon::Dump(Context&, ..) Carbon::Dump(TokenizedBuffer&, ...) etc? i.e., Dump(context instead of context.Dump

That seems like it could work, assuming I can call it from a debugger without the Carbon:: prefix. (I'm not sure if the change from DumpId to Dump was part of your point, but I think we should reserve Dump for functions that print the value itself, rather than printing the thing the value refers to).

Hm, I tried this, with free functions in the Carbon namespace.

namespace Carbon {
LLVM_DUMP_METHOD auto DumpId(const Check::Context& context, SemIR::LocId loc_id)
    -> void;
}  // namespace Carbon

But gdb is not finding them at all, Carbon:: or not.

(gdb) call DumpId(*this, loc_id_and_inst.loc_id)
No symbol "DumpId" in current context.
(gdb) call Carbon::DumpId(*this, loc_id_and_inst.loc_id)
No symbol "DumpId" in namespace "Carbon".

Huh, interesting - could you post a commit/hash/personal PR/whatnot I could use to reproduce/try this?

A small non-carbon example I tried did seem to find a Dump function in another translation unit via qualified name lookup, so I'm not sure what's different in your case/wouldn't mind poking around to figur eit out.

Copy link
Contributor

@geoffromer geoffromer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but I'm leaving it open because I'm not sure Jon's concerns have been addressed.

@dwblaikie
Copy link
Contributor

To be sure, I'm suggesting a construction wouldn't be needed, it's just writing context.dumper().Dump instead of context.Dump

That's more verbose than I'd like, considering that this function is intended to be called by typing out the call expression live.

Maintaining shortness, what if we had Carbon::Dump(Context&, ..) Carbon::Dump(TokenizedBuffer&, ...) etc? i.e., Dump(context instead of context.Dump

That seems like it could work, assuming I can call it from a debugger without the Carbon:: prefix. (I'm not sure if the change from DumpId to Dump was part of your point, but I think we should reserve Dump for functions that print the value itself, rather than printing the thing the value refers to).

Hm, I tried this, with free functions in the Carbon namespace.

namespace Carbon {
LLVM_DUMP_METHOD auto DumpId(const Check::Context& context, SemIR::LocId loc_id)
    -> void;
}  // namespace Carbon

But gdb is not finding them at all, Carbon:: or not.

(gdb) call DumpId(*this, loc_id_and_inst.loc_id)
No symbol "DumpId" in current context.
(gdb) call Carbon::DumpId(*this, loc_id_and_inst.loc_id)
No symbol "DumpId" in namespace "Carbon".

Huh, interesting - could you post a commit/hash/personal PR/whatnot I could use to reproduce/try this?

A small non-carbon example I tried did seem to find a Dump function in another translation unit via qualified name lookup, so I'm not sure what's different in your case/wouldn't mind poking around to figur eit out.

(nevermind - sounds like @jonmeow & @zygoloid figured it out - the Dump functions maybe weren't getting linked into the resultin grogram - Jon's working on some improvements I think (not to steal his thunder, just didn't want @danakj working on a repro, etc for me when it wasn't needed anymore))

@jonmeow
Copy link
Contributor

jonmeow commented Dec 11, 2024

I think maybe there's an alwayslink missing, see #4669 for an approach that I've manually tested and appears to work. (* I should do another pass, I just split things out to separate cc_library's, but I think alwayslink is what's important [ed: did another pass, verified])

One note from discussion yesterday, @zygoloid had mentioned being dubious of overloads working, but @dwblaikie expects they should. So we're going to try overloads, counter to earlier discussion.

We'd also discussed whether DumpId is the right function name -- would it make sense to rename the functions to just Dump in the free function approach?

@danakj
Copy link
Contributor Author

danakj commented Dec 12, 2024

Moving to #4669

@danakj danakj closed this Dec 12, 2024
github-merge-queue bot pushed a commit that referenced this pull request Dec 12, 2024
- Provide `Check::Dump(context, arg)` and similar.
- gdb and lldb should do contextual lookup, and `call Dump(*this,
Lex::TokenIndex::Invalid)` has been tested with gdb.
- Since this is only for debug, keeps the functions fully separated from
code.
- Uses alwayslink to ensure objects are correctly linked, even though
there are no calls.
- `-Wno-missing-prototypes` is needed when we don't have forward
declarations.
- Code is not linked in opt builds, using `#ifndef NDEBUG`.
- This probably could be doing something in BUILD files with a
`select()`, but the `#ifndef` seemed easier.

This is based on #4620, but uses free functions instead of member
functions.

Co-authored-by: Dana Jansens <[email protected]>

---------

Co-authored-by: danakj <[email protected]>
@danakj danakj deleted the dump branch December 13, 2024 15:01
bricknerb pushed a commit to bricknerb/carbon-lang that referenced this pull request Dec 16, 2024
- Provide `Check::Dump(context, arg)` and similar.
- gdb and lldb should do contextual lookup, and `call Dump(*this,
Lex::TokenIndex::Invalid)` has been tested with gdb.
- Since this is only for debug, keeps the functions fully separated from
code.
- Uses alwayslink to ensure objects are correctly linked, even though
there are no calls.
- `-Wno-missing-prototypes` is needed when we don't have forward
declarations.
- Code is not linked in opt builds, using `#ifndef NDEBUG`.
- This probably could be doing something in BUILD files with a
`select()`, but the `#ifndef` seemed easier.

This is based on carbon-language#4620, but uses free functions instead of member
functions.

Co-authored-by: Dana Jansens <[email protected]>

---------

Co-authored-by: danakj <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants