Add extended aggregation #863

Open
wants to merge 47 commits into
base: master
from

Conversation

andrewqian2001datadog
Contributor

@andrewqian2001datadog andrewqian2001datadog commented Oct 28, 2024

Requirements for Contributing to this repository

  • Fill out the template below. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion.
  • The pull request must fix only one issue, or add only one feature, at a time.
  • The pull request must update the test suite to demonstrate the changed functionality.
  • After you create the pull request, all status checks must pass before a maintainer reviews your contribution. For more details, please see CONTRIBUTING.

What does this PR do?

Implements aggregation for histogram, distribution, and timing metrics. In practice this is buffering rather than true aggregation, because these metrics should not be aggregated on the client side. From the V2 section of the design doc: "Both metrics can’t/should not be aggregated on the client side. Histogram would result in wrong metrics and merging distributions is costly while sampling is fast".

This is already implemented in the Go client; see the Go client PR for adding extended aggregation.

Description of the Change

This PR reuses a lot of the existing buffering logic. The buffered_metrics (Histogram, Distribution and Timing) are added to the existing buffer when aggregation is enabled.
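
As a rough illustration of the buffering idea described above (not the code in this PR), the sketch below keeps raw values grouped by metric name, type, and tags, then renders each group as a single multi-value line on flush. `BufferedValues` and its methods are hypothetical names, and the `name:v1:v2|type` packing is assumed to follow the multi-value form of the DogStatsD datagram format.

```python
import threading
from collections import defaultdict


class BufferedValues:
    """Hypothetical buffer: keeps raw values instead of aggregating them."""

    def __init__(self):
        self._lock = threading.Lock()
        # (name, metric_type, tags) -> list of raw values
        self._values = defaultdict(list)

    def add(self, name, metric_type, value, tags=()):
        # Sort tags so the same tag set always maps to the same context.
        key = (name, metric_type, tuple(sorted(tags)))
        with self._lock:
            self._values[key].append(value)

    def flush(self):
        """Return one packed line per context and reset the buffer."""
        with self._lock:
            values, self._values = self._values, defaultdict(list)
        lines = []
        for (name, metric_type, tags), vals in values.items():
            packed = ":".join(str(v) for v in vals)
            line = "{}:{}|{}".format(name, packed, metric_type)
            if tags:
                line += "|#" + ",".join(tags)
            lines.append(line)
        return lines


buf = BufferedValues()
for v in (1, 2, 3):
    buf.add("request.latency", "h", v, tags=["env:dev"])
buf.add("request.size", "d", 42)
lines = buf.flush()
```

Keeping the raw values, rather than merging them, is what distinguishes this from the count and gauge aggregation path: no client-side statistic is computed, so nothing is lost before the values reach the Agent.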

Most of the new logic is in buffered_metric_context.py, which handles sampling of Histograms, Distributions, and Timings.
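
A minimal sketch of rate-based sampling in the spirit of this change, under the assumption that each value is kept independently with probability `rate`. It is not the contents of buffered_metric_context.py; `Sampler` is a hypothetical name, and the `seed` parameter exists only to make the example reproducible.

```python
import random
import threading


class Sampler:
    """Hypothetical helper: keeps each sample with probability `rate`."""

    def __init__(self, seed=None):
        # A private Random instance avoids touching the global generator.
        self._random = random.Random(seed)
        # Serialize access to the generator across threads.
        self._lock = threading.Lock()

    def should_sample(self, rate):
        # random() returns a float in [0.0, 1.0), so rate=1.0 always keeps
        # the sample and rate=0.0 never does.
        with self._lock:
            return self._random.random() < rate


sampler = Sampler(seed=0)
kept = sum(sampler.should_sample(0.25) for _ in range(10000))
```

With a rate of 0.25, roughly a quarter of the 10000 draws should be kept; the exact count depends on the seed.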

Verification Process

Follow the steps here to set up local testing for the Python client.

Replace testapp/main.py with:

import time

from datadog import initialize, statsd

# Buffering is disabled and aggregation is enabled.
options = {
    "statsd_host": "127.0.0.1",
    "statsd_port": 8125,
    "statsd_disable_buffering": True,
    "statsd_disable_aggregation": False,
    "statsd_aggregation_flush_interval": 61,
}

initialize(**options)

iteration = 0
name = "andrew_q_extendedAgg99"
sleep_time = 12  # seconds to wait between submissions

# Submit the values 1 through 5 to the same histogram, over and over.
while True:
    print("-------------------------------------------------")
    print("running :)", iteration)
    statsd.histogram(name, 1)
    time.sleep(sleep_time)
    statsd.histogram(name, 2)
    time.sleep(sleep_time)
    statsd.histogram(name, 3)
    time.sleep(sleep_time)
    statsd.histogram(name, 4)
    time.sleep(sleep_time)
    statsd.histogram(name, 5)
    iteration += 1

Search for your metric name in the metrics explorer and verify values are as expected.

Additional Notes

Release Notes

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If applicable, issue must have at least kind/ and severity/ labels attached.

* add buffered_metrics object type

* update metric_types to include histogram, distribution, timing

* Run tests on any branch
@andrewqian2001datadog andrewqian2001datadog self-assigned this Oct 28, 2024
def should_sample(self, rate):
    """Determine if a sample should be kept based on the specified rate."""
    with self.random_lock:
        return self.random.random() < rate

🔴 Code Vulnerability

Do not use random

Make sure to use values that are actually random. The random module in Python should generally not be used and should be replaced with the secrets module, as noted in the official Python documentation.


@github-actions github-actions bot added the stale Stale - Bot reminder label Nov 29, 2024
@andrewqian2001datadog andrewqian2001datadog removed the stale Stale - Bot reminder label Dec 4, 2024
@DataDog DataDog deleted a comment from github-actions bot Dec 10, 2024
@andrewqian2001datadog andrewqian2001datadog marked this pull request as ready for review December 18, 2024 14:30
@andrewqian2001datadog andrewqian2001datadog added the changelog/Added Added features results into a minor version bump label Dec 18, 2024
Labels
changelog/Added Added features results into a minor version bump
1 participant