Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix regression when throttle timeout is marginal #1572

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

MDSLKTR
Copy link

@MDSLKTR MDSLKTR commented May 8, 2023

Fixes
#1556

Tries to target the regression that was caused by the mentioned issue. The change to check whether the throttleTimeout is actually greater than 0 is readded. To cover for cases the result of the timeUntilUnthrottled calculation was actually off by a small margin might be fixed by storing the timestamp when the throttleing was applied. And after the check runs, updating it. Please let me now if i missed something as this is my first draft for this. The test that was added works with this but it might still be lacking something.

@MDSLKTR MDSLKTR force-pushed the fix-regression-when-throttle-timeout-is-marginal branch from 339967c to 4abe2c3 Compare May 9, 2023 05:31
@MDSLKTR MDSLKTR changed the title fix regression when timeout is marginal fix regression when throttle timeout is marginal May 9, 2023
if (this.pending.length > 0) {
scheduleAt = scheduleAt > 0 ? scheduleAt : CHECK_PENDING_REQUESTS_INTERVAL
}
const timeUntilUnthrottled = this.throttledUntil - this.throttleCurrentTimestamp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I'm missing something here. From reading this code it looks to me like we're going to be checking every single time around the event loop, since the setTimeout is always called with 0.

So the very first time we get throttled, we'll set throttleCurrentTimestamp to 0 and throttledUntil to 10. timeUntilThrottled is 10 - 0, which is greater than 0 and we don't already have a scheduled timer, so we schedule it for next tick. Next tick happens so we null out the timeout id and we update throttleCurrentTimestamp with the current time, let's say it's 1. Now timeUntilThrottled is 10 - 1 which is still greater than 0, so we do the same thing over again.

Copy link
Author

@MDSLKTR MDSLKTR May 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, this would probably create many more calls then necessary, i could imagine that instead using Math.max(timeUntilUnthrottled, 0) would then be more applicable?

This would delay the next call until it is predicting the throttleling to be over, this marginal offset is then solved by setting the timestamp and there would not be the case where timeUntilUnthrottled is 0 i think

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be helpful to read the previous implementation (and the original one). If you check the PR that introduced this (#1532 (comment)), I made a similar suggestion. We just need to make sure we don't end up in an infinite loop

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, i tried to align with what was there before, currently this works without causing a loop, and the added test from the previous PR also passes. Not sure if there is a better way of testing this properly

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to move this forward? so far it looks good and not causing any infinite loops

@Hyjaz
Copy link

Hyjaz commented Jun 5, 2023

@Nevon I would greatly appreciate it if you could review this pull request once again and ultimately approve and merge it. We are currently facing errors that have already been fixed in version 2.2.4, but unfortunately, that version is not currently usable. Thank you!

@allanpaiste
Copy link

allanpaiste commented Jun 6, 2023

Was very glad to find this specific issue having been reported and a fix proposed for it!

Ought there to be a test that would guard against this kind of performance degradation to happen in the future?

How we ended up bumping into this was that there was an integration test that did time jumps with Sinon FakeTimer where it was clear that the internals of FakeTimer ended up throttling the actual Event Loop due to this setTimeout (and another setInterval) causing there to be a huge backlog of timer callbacks. Not saying that this is the strategy that one ought to use to guard against this, but that would indirectly create somewhat easy way of verifying that there are no overly-aggressive setTimeout-zero loops in the system.

@siimsams
Copy link

siimsams commented Jun 20, 2023

As this has been open since May we have decided to abandon hope and downgrade to v1.16 to not have these issues. :(

@edisonpaul4
Copy link

This is really sad, we had to upgrade to KafkaJS and all the problems related tu CPU usage started... we were using KafkaJS 1.6.0 and no problem... this fix WORKS why so many months with out doing anything? Basically KafkaJS 2 is just bad.

Be aware of updating to KafkaJS 2.x.x

@MDSLKTR
Copy link
Author

MDSLKTR commented Aug 22, 2023

i feel like this PR was used to vent a lot, however in order to contribute, i think it makes sense to move this forward and release it as a patch version

Copy link
Contributor

@ekanth ekanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MDSLKTR! I caught up on this change and the fix sounds reasonable to me. @Nevon - @MDSLKTR has addressed your earlier comments. While we understand that you might be busy, really appreciate if you could merge this fix and release 2.2.5. It looks like a lot of people are waiting for this one! 🙏🏼

@MDSLKTR
Copy link
Author

MDSLKTR commented Oct 19, 2023

Any update here?

@MDSLKTR
Copy link
Author

MDSLKTR commented Jan 4, 2024

Feel free to try out installing: https://github.com/MDSLKTR/kafkajs.git#2.2.5, in case that helps your case, no guarantees though obviously

@sefiros1
Copy link

sefiros1 commented Jan 10, 2024

@MDSLKTR thank a lot for your work! How soon can we expect this change in the official release? It seems to me that the previous reviewer (@Nevon) has gone inactive in this thread. @ekanth may you help somehow to push this through?

@MDSLKTR
Copy link
Author

MDSLKTR commented Mar 11, 2024

Our team will be switching back to node-rdkafka while keeping in mind that the official client will be ready at some point https://github.com/confluentinc/confluent-kafka-javascript
So i wont pursue this any further

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants