[WIP][FIXED] Fix request/reply performance when using allow_responses perms #6064

Open
wants to merge 3 commits into base: main

Conversation

jack7803m

Fixes performance issues noted in #6058. Attempts to prune the reply map every replyPermLimit messages, or when more than replyPruneTime has passed since the last prune.
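
Roughly, the new per-client bookkeeping looks like this (a sketch: the two added field names match the diffs below, while the resp layout and the surrounding struct are illustrative rather than the actual server/client.go definitions):

    package sketch

    import "time"

    // Illustrative only: the PR adds counter/timestamp bookkeeping next to the
    // existing replies map so deliverMsg can decide cheaply whether a prune is
    // worth attempting.
    type resp struct {
        t time.Time // when the reply subject was recorded
        n int       // responses sent on it so far
    }

    type client struct {
        replies           map[string]*resp // tracked reply subjects (existing)
        repliesSincePrune int              // replies recorded since the last prune (new)
        lastReplyPrune    time.Time        // when pruneReplyPerms last ran (new)
    }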

Resolves #6058

Signed-off-by: Jack Morris [email protected]

@jack7803m jack7803m requested a review from a team as a code owner October 31, 2024 18:21
@jack7803m (Author)

I'm unsure how those failing tests could be affected by the minimal changes I made.

I still need to add some sort of solution for the infinite expiry, though I'm not sure exactly what direction to take (i.e. return an error or fall back to a default), so I'll leave that decision up to the maintainers.
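
For reference, the "set to a default" option could look roughly like the sketch below; the constant name, its value, and the field access path are assumptions made for illustration, not part of this PR or the server:

    // Hypothetical sketch: clamp a missing/zero allow_responses expiry to a
    // default so time-based pruning always has a bound to work with.
    const defaultReplyExpires = 2 * time.Minute // made-up default

    ttl := c.perms.resp.Expires // configured allow_responses expiry (assumed field path)
    if ttl <= 0 {               // currently treated as "never expires"
        ttl = defaultReplyExpires
    }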

server/client.go Outdated
@@ -3760,6 +3765,9 @@ func (c *client) pruneReplyPerms() {
             delete(c.replies, k)
         }
     }
+
+    c.repliesSincePrune = 0
+    c.lastReplyPrune = time.Now()
Member

Can just use now from above, instead of calling time.Now() again.

Author

Good point! Just fixed it.
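
Concretely, the tail of pruneReplyPerms becomes something like this (a sketch, assuming the now value already computed earlier in that function, as the comment above suggests):

    // Reuse the now computed at the top of pruneReplyPerms instead of
    // calling time.Now() a second time.
    c.repliesSincePrune = 0
    c.lastReplyPrune = now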

@@ -3636,7 +3640,8 @@ func (c *client) deliverMsg(prodIsMQTT bool, sub *subscription, acc *Account, su
     // do that accounting here. We only look at client.replies which will be non-nil.
     if client.replies != nil && len(reply) > 0 {
         client.replies[string(reply)] = &resp{time.Now(), 0}
-        if len(client.replies) > replyPermLimit {
+        client.repliesSincePrune++
+        if client.repliesSincePrune > replyPermLimit || time.Since(client.lastReplyPrune) > replyPruneTime {
Member

Do we have a sense of how much more memory this will hold onto under heavy load?

Author
The original issue was that it held onto too much memory and looped through all of it on every message.

I just added some debug statements and found that if the reply subject is already allowed (e.g. "pub": ">"), the reply counter never actually goes up, so that subject can't be pruned until it expires by time. I'm noting this because I'm going to look at a fix for that too, though I'm not sure whether it will have broader effects (hopefully not).

Assuming subjects are pruned as they're replied to, the worst case is that the replies map grows to at most replyPermLimit entries more than it could have reached before. Even for that to happen, the map would have to fill up with subjects, survive a prune attempt, and then have every entry expire by the next message. In that case the old behavior (pruning on every message once the map exceeds replyPermLimit) would prune immediately, whereas this solution holds onto that memory for the next replyPermLimit messages, making the map size replyPermLimit * 2.

Under normal heavy load it shouldn't make a significant difference, since when things are configured properly the current behavior typically only runs the prune once every replyPermLimit messages anyway.
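
As a toy calculation of that worst case (the limit value here is made up for illustration and is not the server's actual constant):

    package main

    import "fmt"

    func main() {
        const replyPermLimit = 4096 // illustrative value only

        // Worst case: the map is full when a prune runs but nothing is removable
        // yet, and the next prune fires only after another replyPermLimit replies.
        survivedPrune := replyPermLimit        // entries still in the map after the "empty" prune
        addedBeforeNextPrune := replyPermLimit // new entries tracked before the counter trips again

        fmt.Println("worst-case map size:", survivedPrune+addedBeforeNextPrune) // 8192 = replyPermLimit * 2
    }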

Successfully merging this pull request may close these issues: Severe request/reply performance hit when using allow_responses map (#6058).