Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddtrace/tracer: update log msg to accurately count dropped traces #2845

Merged
merged 19 commits into from
Sep 24, 2024

Conversation

hannahkm
Copy link
Contributor

@hannahkm hannahkm commented Sep 6, 2024

What does this PR do?

  • Lower the severity of the payload queue log to debug.
  • Introduce a new value in the tracer, totalTracesDropped to keep track of the number of dropped traces.
  • Log the total number of dropped traces, if they exist, every one second.

Motivation

This PR addresses an issue raised in APMAPI-225. Log messages produced when dropping chunks were too spammy (sometimes occurring > 1,000 times per second) and relayed inaccurate information.

Reviewer's Checklist

  • Changed code has unit tests for its functionality at or near 100% coverage.
  • System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • There is a benchmark for any new code, or changes to existing code.
  • If this interacts with the agent in a new way, a system test has been added.
  • Add an appropriate team label so this PR gets put in the right place for the release notes.
  • Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

Unsure? Have a question? Request a review!

@pr-commenter
Copy link

pr-commenter bot commented Sep 6, 2024

Benchmarks

Benchmark execution time: 2024-09-24 14:26:09

Comparing candidate commit 214c1bc in PR branch hannahs/log-dropped-traces with baseline commit 7699f9e in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 1 unstable metrics.

@hannahkm hannahkm marked this pull request as ready for review September 9, 2024 21:09
@hannahkm hannahkm requested a review from a team as a code owner September 9, 2024 21:09
ddtrace/tracer/tracer.go Outdated Show resolved Hide resolved
ddtrace/tracer/tracer.go Outdated Show resolved Hide resolved
@hannahkm hannahkm marked this pull request as draft September 17, 2024 14:36
@hannahkm hannahkm marked this pull request as ready for review September 17, 2024 15:07
Copy link
Contributor

@mtoffl01 mtoffl01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to block this merge, so I am approving as it looks all clear, but in my opinion and from a troubleshooting perspective, I don't love the approach of "Every X seconds, check how many traces were dropped in the previous interval, and log about it." Logging about the dropped trace at the moment it is dropped is better, but you'd have to go back to using mutexes (or related) to achieve that.

@dianashevchenko
Copy link
Contributor

@mtoffl01 we got two requests internally to change the logging because of how spammy it was. Previously, it was indeed emitting a log each time the trace was dropped. So, my suggestion would be to make the interval configurable and smaller, for example 1 second, but avoid logging at each trace drop. What do you think? I can also point you to the original slack threads.

@hannahkm
Copy link
Contributor Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented Sep 24, 2024

❌ MergeQueue

You are not allowed to use the merge queue towards main.

If you need support, contact us on Slack #devflow with those details!

@hannahkm
Copy link
Contributor Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented Sep 24, 2024

🚂 MergeQueue: pull request added to the queue

The median merge time in main is 0s.

Use /merge -c to cancel this operation!

@dd-devflow
Copy link

dd-devflow bot commented Sep 24, 2024

MergeQueue: The checks failed on this merge request

Tests failed on this commit dfedcf5:

What to do next?

  • Investigate the failures and when ready, re-add your pull request to the queue!
  • Any question, go check the FAQ.

If you need support, contact us on Slack #devflow with those details!

@dianashevchenko
Copy link
Contributor

/merge --cancel

@hannahkm
Copy link
Contributor Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented Sep 24, 2024

🚂 MergeQueue: pull request added to the queue

The median merge time in main is 0s.

Use /merge -c to cancel this operation!

@dd-devflow
Copy link

dd-devflow bot commented Sep 24, 2024

🚨 MergeQueue: This merge request is in error

mergequeue build completed successfully, but the github api returned an error while merging the pr.
It's probably because:

  • repository configuration requires approval by code owners, but no code owner approved this PR
  • target branch of PR is restricted to only allow up-to-date branches, but the pr is now outdated
Details

Error: PUT https://api.github.com/repos/DataDog/dd-trace-go/pulls/2845/merge: 405 5 of 5 required status checks are expected. []

FullStacktrace:
activity error (type: github.GithubService_MergePullRequest, scheduledEventID: 41, startedEventID: 42, identity: 1@github-worker-b9c46c97c-gl2p7@): PUT https://api.github.com/repos/DataDog/dd-trace-go/pulls/2845/merge: 405 5 of 5 required status checks are expected. [] (type: ErrorResponse, retryable: true)

If you need support, contact us on Slack #devflow with those details!

@hannahkm
Copy link
Contributor Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented Sep 24, 2024

🚂 MergeQueue: pull request added to the queue

The median merge time in main is 0s.

Use /merge -c to cancel this operation!

@dd-mergequeue dd-mergequeue bot merged commit eef52d3 into main Sep 24, 2024
144 of 146 checks passed
@dd-mergequeue dd-mergequeue bot deleted the hannahs/log-dropped-traces branch September 24, 2024 16:30
MNThomson pushed a commit to agilebits/dd-trace-go that referenced this pull request Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants