Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loki.source.windowsevent: save bookmark every 10s #2558

Merged
merged 6 commits into from
Feb 5, 2025

Conversation

wildum
Copy link
Contributor

@wildum wildum commented Jan 29, 2025

PR Description

The investigation carried in #2539 shows that more than 60% of the CPU usage of this component is spent updating the bookmark file (position file). This is because it is done after every events.

With this PR, the position is updated every 10s via a goroutine instead, just like it is done in the loki.source.file component.
The position is also updated when the component exits.

PR Checklist

  • CHANGELOG.md updated
  • Tests updated

@wildum wildum requested a review from a team as a code owner January 29, 2025 14:51
@@ -226,5 +252,6 @@ func (t *Target) Stop() error {
close(t.done)
t.wg.Wait()
t.handler.Stop()
t.saveBookmarkPosition()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we're triggering the <-t.done case in updateBookmark's loop already in the Stop function, should we keep the save behavior in that function and just put it in the t.done case before returning from the function? Or is there potential changes to the bookmark in the handler before handler.Stop() returns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, no potential changes because the "loop" goroutine is guaranteed to be stopped. I will make the change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah wait actually no, because both goroutines (loop and updateBookmark) are stopped in parallel, so to get the last position you need to make sure that the loop goroutine exited

@mattdurham
Copy link
Collaborator

Looks reasonable, should we consider this a breaking change? Its a big change in functionality that I think is warranted but we may want a bigger callout.

@wildum
Copy link
Contributor Author

wildum commented Jan 29, 2025

Looks reasonable, should we consider this a breaking change? Its a big change in functionality that I think is warranted but we may want a bigger callout.

For the user there should not be any noticeable change because the position is guaranteed to be updated when alloy exits. I believe that the only thing noticeable for the user will be a massive CPU improvement. I sent a custom installer to community users who run it at scale (see #2539).

@wildum
Copy link
Contributor Author

wildum commented Jan 29, 2025

The only risk is when Alloy does not exit properly (a panic?). Then you may repeat up to the last 10s of events that were processed and get some duplication

@mattdurham
Copy link
Collaborator

That's reasonable then, ignore more statement :), we never guarantee signals on unexpected shutdowns.

@wildum
Copy link
Contributor Author

wildum commented Feb 5, 2025

Tests at scale from community users confirm the CPU improvements without regression:

The conversation in the linked issue mentions an issue that is unrelated to this PR and which will be treated separately (#2539 (comment))

@dehaansa @mattdurham the PR is ready for another review

Copy link
Collaborator

@mattdurham mattdurham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shipit!

@wildum wildum merged commit b9118c3 into main Feb 5, 2025
30 checks passed
@wildum wildum deleted the change-loki-source-windowsevent-position branch February 5, 2025 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants