Default values for histogram bucket boundaries are oriented around milliseconds rather than seconds #5821

jakegavin · 2024-09-13T20:48:26Z

Problem Statement

Histograms are commonly used for recording latencies. The default bucket boundaries are []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}. These work well for values in milliseconds; however, the Prometheus documentation recommends seconds rather than milliseconds as the unit. When latency is recorded in seconds with the default buckets, the vast majority of timings land in the 0 to 5 second bucket, which results in inaccurate histogram quantile calculations.
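
To make the effect concrete, here is a minimal sketch (not taken from the SDK) that assigns sample latencies to the default boundaries quoted in this issue. It assumes upper-inclusive buckets, as the OpenTelemetry specification describes for explicit bucket histograms. The helper names `bucketIndex` and `distinctBuckets` and the sample values are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultBoundaries copies the default bucket boundaries quoted in this issue.
var defaultBoundaries = []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}

// bucketIndex returns the index of the bucket a value falls into, assuming
// upper-inclusive buckets: index i covers (boundaries[i-1], boundaries[i]],
// and index len(boundaries) is the overflow bucket.
func bucketIndex(boundaries []float64, v float64) int {
	return sort.SearchFloat64s(boundaries, v)
}

// distinctBuckets counts how many different buckets the values occupy.
func distinctBuckets(boundaries, values []float64) int {
	seen := map[int]bool{}
	for _, v := range values {
		seen[bucketIndex(boundaries, v)] = true
	}
	return len(seen)
}

func main() {
	// Hypothetical request latencies, first in milliseconds, then in seconds.
	millis := []float64{12, 85, 200, 750, 1300, 4900}
	seconds := make([]float64, len(millis))
	for i, ms := range millis {
		seconds[i] = ms / 1000
	}

	fmt.Println("distinct buckets (ms):", distinctBuckets(defaultBoundaries, millis))
	fmt.Println("distinct buckets (s): ", distinctBuckets(defaultBoundaries, seconds))
}
```

With these samples, the millisecond values spread across several buckets while the same latencies expressed in seconds all collapse into the single (0, 5] bucket, so a quantile estimated from that histogram can only interpolate within one very wide range.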

This is very similar to this issue in the .NET repo: open-telemetry/opentelemetry-dotnet#4797

Proposed Solution

opentelemetry-go could use a different set of default buckets when the histogram units are known to be seconds.
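
One way such a unit-aware default could look is sketched below. This is purely illustrative: `boundariesFor` is a hypothetical helper, not an opentelemetry-go API, and scaling the millisecond defaults by 1/1000 is only an assumption about what seconds-oriented defaults might be.

```go
package main

import "fmt"

// millisDefaults copies the default bucket boundaries quoted in this issue.
var millisDefaults = []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}

// boundariesFor is a hypothetical helper: it returns the existing defaults
// unless the instrument unit is "s", in which case it returns the same
// boundaries scaled from milliseconds to seconds.
func boundariesFor(unit string) []float64 {
	out := make([]float64, len(millisDefaults))
	copy(out, millisDefaults)
	if unit == "s" {
		for i := range out {
			out[i] /= 1000
		}
	}
	return out
}

func main() {
	fmt.Println("ms:", boundariesFor("ms"))
	fmt.Println("s: ", boundariesFor("s"))
}
```

Any instrument whose unit is not exactly "s" keeps today's defaults in this sketch, which is also why a change like this would alter exported data for existing users.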

This was implemented in the .NET library here: open-telemetry/opentelemetry-dotnet#4820

Alternatives

The current workaround is to use the WithExplicitBucketBoundaries option on all histograms dealing in seconds.

Prior Art

.NET issue: open-telemetry/opentelemetry-dotnet#4797
.NET solution: open-telemetry/opentelemetry-dotnet#4820

Additional Context

This would likely be a breaking change.


MrAlias · 2024-09-13T23:44:23Z

> This would likely be a breaking change.

Agreed. This is the reason we have not made the change.

It's also the reason explicitly called out in the specification:

> SDKs SHOULD use the default value when boundaries are not explicitly provided, unless they have good reasons to use something different (e.g. for backward compatibility reasons in a stable SDK release).

This does not look like a proposal we plan to accept.

dmathieu · 2024-11-27T09:56:50Z

Closing this per @MrAlias's comment, as this would be a breaking change in both semver and the specification.

aslatter · 2025-02-01T21:52:38Z

For anyone else finding this, it looks like most built-in exporters support specifying a different default-bucket-set when constructing the exporter. For example, with the prom-exporter:

	// import "github.com/prometheus/client_golang/prometheus"
	// import otelprom "go.opentelemetry.io/otel/exporters/prometheus"
	// import sdkmetric "go.opentelemetry.io/otel/sdk/metric"

	// create an otel metric-exporter associated with a prometheus registry
	metricExporter, err := otelprom.New(
		otelprom.WithRegisterer(promRegistry),

		// OTEL default buckets assume you're using milliseconds. Substitute defaults
		// appropriate for units of seconds.
		otelprom.WithAggregationSelector(func(ik sdkmetric.InstrumentKind) sdkmetric.Aggregation {
			switch ik {
			case sdkmetric.InstrumentKindHistogram:
				return sdkmetric.AggregationExplicitBucketHistogram{
					Boundaries: prometheus.DefBuckets,
					NoMinMax:   false,
				}
			default:
				return sdkmetric.DefaultAggregationSelector(ik)
			}
		}),
	)
	// do something with err

	// create a meter-provider associated with the exporter
	meterProvider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(metricExporter),
	)

	// do something with meterProvider

jakegavin added the enhancement New feature or request label Sep 13, 2024

CCOLLOT mentioned this issue Nov 8, 2024

[AutoInstrumentation] NodeJS histogram buckets should be configurable open-telemetry/opentelemetry-operator#3436

Open

dmathieu closed this as not planned Nov 27, 2024
