Add more logs and/or duration fields to existing logs to give more visibility into slow reconciliations #18923
Comments
Very strongly in favor of this. We also need documentation about these different metrics.
argoproj#18923] Closes argoproj#18923 There are some gaps in debugging information for long reconciliations. Fill in a lot of those gaps by adding more debug logs with timing information about different code execution steps. Signed-off-by: Andrii Korotkov <[email protected]>
@crenshaw-dev, I've sent a PR. It adds a lot of metrics, though, and I've named most of them after the functions they time. What's the best way to document these? Should the documentation be aimed at people who read the controller code, or should it also work for people who don't?
argoproj#18923] Closes argoproj#18923 There are some gaps in debugging information for long reconciliations. Fill in a lot of those gaps by adding more debug logs with timing information about different code execution steps. Also, fix a flaky test in app_test.go. Signed-off-by: Andrii Korotkov <[email protected]>
Already discovered a slowness/optimization opportunity! #18929
Documents argoproj#18923 Add the info about existing and new logs that are being added for reconciliation. Signed-off-by: Andrii Korotkov <[email protected]>
The timing started before reconciliation timing started including get from the queue, leading to very big times reported, not making sense for what's actually going on. Signed-off-by: Andrii Korotkov <[email protected]>
argoproj#18923] (argoproj#18926) Closes argoproj#18923 There are some gaps in debugging information for long reconciliations. Fill in a lot of those gaps by adding more debug logs with timing information about different code execution steps. Also, fix a flaky test in app_test.go. Signed-off-by: Andrii Korotkov <[email protected]>
The timing started before reconciliation timing started including get from the queue, leading to very big times reported, not making sense for what's actually going on. Signed-off-by: Andrii Korotkov <[email protected]> Signed-off-by: Vegard Færgestad <[email protected]>
argoproj#18923] (argoproj#18926) Closes argoproj#18923 There are some gaps in debugging information for long reconciliations. Fill in a lot of those gaps by adding more debug logs with timing information about different code execution steps. Also, fix a flaky test in app_test.go. Signed-off-by: Andrii Korotkov <[email protected]> Signed-off-by: Javier Solana <[email protected]> Signed-off-by: Javier Solana <[email protected]>
The timing started before reconciliation timing started including get from the queue, leading to very big times reported, not making sense for what's actually going on. Signed-off-by: Andrii Korotkov <[email protected]> Signed-off-by: Javier Solana <[email protected]> Signed-off-by: Javier Solana <[email protected]>
Summary
There's some logging for different steps taken during reconciliation, as well as some duration fields on "Reconciliation completed" log entries. However, they aren't enough to tell where the slowness may be coming from.
Motivation
Sometimes reconciliation takes 30+ min and it's unclear why. Logs don't give enough info.
Proposal
Add more log entries throughout the different steps of reconciliation and/or add more timing/duration information to the "Reconciliation completed" log entry.