-
Notifications
You must be signed in to change notification settings - Fork 36.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rpc: skip getpeerinfo for a peer without CNodeStateStats #26515
rpc: skip getpeerinfo for a peer without CNodeStateStats #26515
Conversation
// exist for the duration of the connection - except if there is a race where the | ||
// peer got disconnected in between the GetNodeStats() and the GetNodeStateStats() | ||
// calls. In this case, the peer doesn't need to be reported here. | ||
if (!fStateStats) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't tested yet if this works, but checking for fStateStats
seems a bit of a smell in general and I wanted to remove it entirely, e.g. #25923, though the way you use it here looks like a real improvement that could be orthogonal to much of that pull.
Note that you haven't updated the code for the GUI peers details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into #25923 right now - it seems that it does some what Marco suggested below, making some fields, like txrelay, optional for an intrinsic reason instead of using defaults like -1 or 0 when the info is not known.
As for the GUI - I didn't change it on purpose, because this comment scared me off. I think that #25923 is affected by that too, I'll comment there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TRY_LOCK
for accessing CNodeStats
has been in the GUI since peer information was added in #4225 (see #4225 (comment)). I'm not sure why it was decided that the GUI could block on taking cs_vNodes
, but can't block on taking cs_main
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either, but one reason might be because it slows down IBD a little if the GUI constantly competes with validation for cs_main
just to update the peers window. Another reason might be that cs_main
is held more often and longer by the node compared to cs_vNodes
(current name m_nodes_mutex
), so it might become noticeable that the GUI freezes. In both of these cases, skipping some of the requests from the GUI could make sense.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK, will look further at this change. |
Do we really want this, given that the fields have been optional forever(?)? I think we should first think if there are fields that should be optional. If the answer to that is no, then this seems fine to do, but without an answer, it seems too early to change. |
I think that if we make any fields optional, we should have a reason for this, and communicate to the users within the RPC help in which situations these fields won't be included. The reason should be something that has to do with the actual meaning of the field, e.g. if there is a good reason why it's not available under certain conditions (like it's currently being done for fields like |
yes, agree. This is what I was trying to say we should answer first before changing the code. |
Ok, I'll look into this field by field before taking this out of draft. However, note that if we decide to keep certain field optional for an intrinsic reason, it would likely be a behavior change, because we currently include all |
Heh, this might actually be a convincing argument to go ahead with the changes here as-is. (And then add a new field name, if needed, for "truly" optional fields instead of breaking the behaviour?) |
Though,
I don't think this part is true. |
There is no situation in which CNodeStateStats could be missing for a legitimate reason - this can only happen if there is a race condition between peer disconnection and the getpeerinfo call, in which case the disconnected peer doesn't need to be included in the response.
ec839fe
to
6fefd49
Compare
I removed the part that makes the
Removed from OP |
Approach ACK 6fefd49
I don't think this is entirely accurate and I summarized my thinking about it below. It seems to me that this is all the result of us re-architecting our code base, specifically splitting lower level network and application layer per-peer state into separate data structures managed by different components (CConnman, PeerManager). Obviously an overall good development which we don't need to rehash the merits of here. Ideally these components are loosely coupled (they aren't really right now but I think we will/should get there eventually), such that the internal state of one is not dependent on the internal state of the other. Even though they are still tightly coupled at the moment, we should not rely on that to be the case going forward. So in my opinion, any user of the CConnman & PeerManager API (in this case the RPC handler querying statistics) should not expect there to be consistency between the internal state of these two modules and instead handle the edge cases (e.g. only one component returns stats) on its own end. I like this PR because it is essentially doing what I describe above, that is handling the edge case in which only the CNode stats are available while the application layer stats are not (due to the connection being dropped in between the queries). |
I wonder if this will introduce races in code (e.g. our tests) that assume a missing entry in the dict implies a fully disconnected node. Edit: If there was a race, it would already exist regardless of this pull. |
review ACK 6fefd49 👒 Show signatureSignature:
|
6fefd49 rpc: Require NodeStateStats object in getpeerinfo (Martin Zumsande) Pull request description: The objects `CNode`, `CNodeState` and `Peer` store different info about a peer - `InitializeNode()` and `FinalizeNode()` make sure that for the duration of a connection, we should always have one of each for a peer. Therefore, there is no situation in which, as part of getpeerinfo RPC, `GetNodeStateStats()` (which requires a `CNodeState` and a `Peer` entry for a `NodeId` to succeed) could fail for a legitimate reason while the peer is connected - this can only happen if there is a race condition between peer disconnection and the `getpeerinfo` processing (see also a more detailed description of this in #26457 (review)). But in this case I think it's better to just not include the newly disconnected peer in the response instead of returning just parts of its data. An earlier version of this PR also made the affected `CNodeStateStats` fields non-optional (see mzumsande@5f900e2). Since this conflicts with #25923 and should be a separate discussion, I removed that commit from this PR. ACKs for top commit: dergoegge: Approach ACK 6fefd49 MarcoFalke: review ACK 6fefd49 👒 Tree-SHA512: 89c8f7318df4634c1630415de9c8350e6dc2d14d9d07e039e5b180c51bfd3ee2ce99eeac4f9f858af7de846f7a6b48fcae96ebac08495b30e431a5d2d4660532
This has been merged. |
These are no-longer optional after bitcoin#26515, so remove the documentation, and no-op fStateStats checks.
…ateStats 6fefd49 rpc: Require NodeStateStats object in getpeerinfo (Martin Zumsande) Pull request description: The objects `CNode`, `CNodeState` and `Peer` store different info about a peer - `InitializeNode()` and `FinalizeNode()` make sure that for the duration of a connection, we should always have one of each for a peer. Therefore, there is no situation in which, as part of getpeerinfo RPC, `GetNodeStateStats()` (which requires a `CNodeState` and a `Peer` entry for a `NodeId` to succeed) could fail for a legitimate reason while the peer is connected - this can only happen if there is a race condition between peer disconnection and the `getpeerinfo` processing (see also a more detailed description of this in bitcoin#26457 (review)). But in this case I think it's better to just not include the newly disconnected peer in the response instead of returning just parts of its data. An earlier version of this PR also made the affected `CNodeStateStats` fields non-optional (see mzumsande@5f900e2). Since this conflicts with bitcoin#25923 and should be a separate discussion, I removed that commit from this PR. ACKs for top commit: dergoegge: Approach ACK 6fefd49 MarcoFalke: review ACK 6fefd49 👒 Tree-SHA512: 89c8f7318df4634c1630415de9c8350e6dc2d14d9d07e039e5b180c51bfd3ee2ce99eeac4f9f858af7de846f7a6b48fcae96ebac08495b30e431a5d2d4660532
There is no situation in which CNodeStateStats could be missing for a legitimate reason - this can only happen if there is a race condition between peer disconnection and the getpeerinfo call, in which case the disconnected peer doesn't need to be included in the response. Github-Pull: bitcoin#26515 Rebased-From: 6fefd49
1dc0e4b rpc: remove optional from fStateStats fields (fanquake) Pull request description: These are no-longer optional after #26515, so remove the documentation, and no-op `fStateStats` checks. ACKs for top commit: dergoegge: Code review ACK 1dc0e4b Tree-SHA512: 06d4550e866341b379bfdbc72d67d71a3b7ceceec06ebd4c5e6f178b75fe40cbf4aff51adba1bc52590e69e818cbdecb0366bf1528c59c5c3dff5bbdba8eac68
…t CNodeStateStats e72313e rpc: Require NodeStateStats object in getpeerinfo (Martin Zumsande) Pull request description: Backports #26515. ACKs for top commit: fanquake: ACK e72313e Tree-SHA512: 28e885ea299fe8a3a7538628d413c434bc42c251a2c1ae238bca0652709f5bd781eb157675171ab538f6e2f6f720f1c184200ac3857f6c78f48858949eed49da
Was backported to 24.x in #26457. |
The objects
CNode
,CNodeState
andPeer
store different info about a peer -InitializeNode()
andFinalizeNode()
make sure that for the duration of a connection, we should always have one of each for a peer.Therefore, there is no situation in which, as part of getpeerinfo RPC,
GetNodeStateStats()
(which requires aCNodeState
and aPeer
entry for aNodeId
to succeed) could fail for a legitimate reason while the peer is connected - this can only happen if there is a race condition between peer disconnection and thegetpeerinfo
processing (see also a more detailed description of this in #26457 (review)).But in this case I think it's better to just not include the newly disconnected peer in the response instead of returning just parts of its data.
An earlier version of this PR also made the affected
CNodeStateStats
fields non-optional (see mzumsande@5f900e2). Since this conflicts with #25923 and should be a separate discussion, I removed that commit from this PR.