-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable scheduleSend when onMessage callback is running #144
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #144 +/- ##
==========================================
+ Coverage 76.11% 77.02% +0.90%
==========================================
Files 24 24
Lines 1834 1893 +59
==========================================
+ Hits 1396 1458 +62
+ Misses 326 324 -2
+ Partials 112 111 -1
☔ View full report in Codecov by Sentry. |
client/internal/receivedprocessor.go
Outdated
func (r *receivedProcessor) onMessage(ctx context.Context, msgData *types.MessageData) { | ||
r.senderCommon.DisableScheduleSend() | ||
r.callbacks.OnMessage(ctx, msgData) | ||
r.senderCommon.EnableScheduleSend() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move this up and use defer
so that if the code becomes more complicated in the future there is no chance that EnableScheduleSend()
is not executed on some of the return paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM, I just left a few comments about improving the comments (pun not intended).
r.callbacks.OnMessage(ctx, msgData) | ||
|
||
r.sender.EnableScheduleSend() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a comment to EnableScheduleSend definition that calling EnableScheduleSend when it is already enabled is allowed? (since we do it twice here).
client/internal/sender.go
Outdated
@@ -31,6 +38,12 @@ type SenderCommon struct { | |||
// Indicates that there is a pending message to send. | |||
hasPendingMessage chan struct{} | |||
|
|||
// Indicates onMessage callback is running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the comment doesn't match what the field name implies. It would be best to reword it to something like "Set to non-zero to indicate that the message sending is disabled"
client/internal/sender.go
Outdated
} | ||
} | ||
|
||
// ScheduleSend signals to HTTPSender that the message in NextMessage struct | ||
// is now ready to be sent. If there is no pending message (e.g. the NextMessage was | ||
// already sent and "pending" flag is reset) then no message will be sent. | ||
func (h *SenderCommon) ScheduleSend() { | ||
if h.IsSendingDisabled() { | ||
// onMessage callback is running, ScheduleSend() will rerun after it is done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, comment doesn't match what the code and func names imply.
client/internal/sender.go
Outdated
atomic.StoreInt32(&h.isSendingDisabled, 1) | ||
} | ||
|
||
// EnableScheduleSend re-enables ScheduleSend and checks if it was called during onMessage callback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I would refrain from mentioning onMessage here. It is not a concern of SenderCommon.
Technically the change in this PR can be the cause of the test failure:
This happened because the server did not receive a message client was supposed to send. I haven't seen this error before and it is not reproducible on the latest I can't see the problem by reading the code but maybe there is some race that I am not seeing that causes a ScheduleSend to missfire? Can you try |
client/internal/sender.go
Outdated
select { | ||
case <-h.registerScheduleSend: | ||
h.ScheduleSend() | ||
case <-time.Tick(100 * time.Millisecond): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unclear why this is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixed my sync issue. Several tests failed when testing with -race -count 1000 flags. Honestly, I don't fully understand why but it seems that the default case was selected over reading from the registerScheduleSend channel even when it (should have) held an unread message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failing tests may be indicating a real problem. The current code (before this PR) does not fail the tests.
The addition of the second case that waits a time without explaining why it fixes the problem is not an acceptable code change. We need precise explanation what the problem is and why this is the correct fix for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed this bypass and tests are looking stable at the moment with the updated branch. Still, I don't have a smoking gun regarding the sync issue, but it could have been resolved due to recent commits. Would appreciate your opinion on whether or not this should stay blocked or more changes are needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking until we have clarity around failing tests.
@nemoshlag, any chance you can rebase and resolve the conflicts? |
1842cd8
to
ad3a057
Compare
Resolves #88