OM 2.0: Consider using complex values instead of suffixes #283

Open
dashpole opened this issue Dec 2, 2024 · 7 comments

Comments

@dashpole

dashpole commented Dec 2, 2024

This deserves its own proposal, but I'll outline the broad idea here to start the discussion and gather high-level feedback.

Idea

The idea is that we could use complex values for fixed-bucket histograms, summaries, and counters, similar to what we plan to do for native histograms. In OM 1.0 and in the Prometheus text format, those types are represented using multiple series with suffixes and special labels (e.g. the _bucket suffix, or the le label for histograms). Counters are included here because they have a "total" and a "start time".
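To make the contrast concrete, here is a minimal sketch of how suffixed series could be folded into a single complex value. The `HistogramValue` type and the `collapse` helper are hypothetical names made up for illustration; they are not part of OpenMetrics or any Prometheus library, and the sample numbers are invented.

```python
from dataclasses import dataclass, field


@dataclass
class HistogramValue:
    """Hypothetical complex value holding all fields of one classic histogram."""
    # Upper bound (the "le" label) mapped to the cumulative bucket count.
    buckets: dict[float, float] = field(default_factory=dict)
    count: float = 0.0
    sum: float = 0.0


def collapse(name: str, samples: list[tuple[str, dict[str, str], float]]) -> HistogramValue:
    """Fold suffixed series (name_bucket, name_count, name_sum) into one value."""
    value = HistogramValue()
    for series_name, labels, sample in samples:
        if series_name == f"{name}_bucket":
            # float() accepts "+Inf", so the overflow bucket needs no special case.
            value.buckets[float(labels["le"])] = sample
        elif series_name == f"{name}_count":
            value.count = sample
        elif series_name == f"{name}_sum":
            value.sum = sample
    return value


# Four separate series in the OM 1.0 / Prometheus text model (illustrative data).
samples = [
    ("request_duration_seconds_bucket", {"le": "0.1"}, 3.0),
    ("request_duration_seconds_bucket", {"le": "+Inf"}, 5.0),
    ("request_duration_seconds_count", {}, 5.0),
    ("request_duration_seconds_sum", {}, 1.2),
]

# One complex value in the proposed model.
hist = collapse("request_duration_seconds", samples)
```

In this sketch the four series become one sample whose value carries the buckets, count, and sum together, which is roughly what "complex value" means in this proposal.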

Advantages

Disadvantages

  • Breaks existing user expectations and queries. Suffixes are very deeply embedded in the Prometheus ecosystem, and this would be a large change for many users.
  • PromQL queries become more complicated because accessing fields requires functions.
    • E.g. sum(request_duration_seconds_count) -> histogram_count(sum(request_duration_seconds)).
    • This would happen anyways if the user migrates to native histograms.
  • Little benefit for summaries and histograms, as we expect/recommend users adopt native histograms anyways.
  • Is there a text format representation that is readable AND easy to generate AND efficient enough to parse, for such a model?
  • It would make OM 2.0 text significantly different from 1.0 and Prometheus text, so some education and a big change in parsers/generators would be needed. Not a blocker, but something to keep in mind as a con.
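As a toy illustration of the query change mentioned in the list above, the sketch below mimics both query styles in Python. `Hist` is a hypothetical stand-in for a complex histogram value, and the Python function `histogram_count` only imitates the idea of the PromQL function of the same name; nothing here is real PromQL evaluation, and the data is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hist:
    """Hypothetical complex histogram value with only the fields needed here."""
    count: float
    sum_value: float

    def __add__(self, other: "Hist") -> "Hist":
        # Summing complex values adds them field by field.
        return Hist(self.count + other.count, self.sum_value + other.sum_value)


def histogram_count(h: Hist) -> float:
    """Toy stand-in for accessing the count field through a function."""
    return h.count


# One value per scraped instance (illustrative data).
per_instance = [Hist(5.0, 1.2), Hist(7.0, 2.5)]

# Suffix style: sum(request_duration_seconds_count)
suffix_style = sum(h.count for h in per_instance)

# Complex style: histogram_count(sum(request_duration_seconds))
total = per_instance[0]
for h in per_instance[1:]:
    total = total + h
complex_style = histogram_count(total)
```

Both styles give the same number; the difference is only that the field is reached through a function instead of a series name suffix.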

Alternatives

We could only use a complex value for counters, to support the start time in addition to the value. For fixed-bucket histograms and summaries, keep the existing suffixes and labels, but use the "complex counter" value for cumulative series. For users that have migrated from summaries and fixed-bucket histograms to native histograms, this has most of the advantages of the above, without many of the disadvantages for users using summaries or fixed-bucket histograms.
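A minimal sketch of what such a "complex counter" could look like, assuming a value that carries the total and an optional start time. `CounterValue` and `increase` are hypothetical names for illustration, and the reset handling shown is one plausible use of the start time, not behavior defined by any spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CounterValue:
    """Hypothetical complex counter value: the total plus when counting started."""
    total: float
    start_time: Optional[float] = None  # Unix seconds; None if the exposer omits it


def increase(prev: CounterValue, curr: CounterValue) -> float:
    """Increase between two scrapes of the same counter.

    Assumption: a changed start time means the counter was reset, so the whole
    current total counts as new. Without start times, fall back to detecting a
    reset from a decreasing total.
    """
    both_known = prev.start_time is not None and curr.start_time is not None
    if both_known and curr.start_time != prev.start_time:
        return curr.total
    if curr.total < prev.total:
        return curr.total
    return curr.total - prev.total


steady = increase(CounterValue(10.0, 100.0), CounterValue(15.0, 100.0))
restarted = increase(CounterValue(10.0, 100.0), CounterValue(12.0, 200.0))
no_start_time = increase(CounterValue(10.0), CounterValue(4.0))
```

The `restarted` case is the interesting one: the total went up from 10 to 12, so a reset would go unnoticed without the start time carried in the value.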

@ArthurSens
Member

With Native Histograms stabilizing, is there a scenario where a user would prefer classic histograms over native histograms? If not, should we invest effort in this? 😅

> Solves prometheus/prometheus#6541 by supporting the created timestamp in the value.

We also planned to move _created to something similar to the way we handle metadata. Would that be enough for classic histograms as well?

In our first call, we mentioned that one of our intentions with 2.0 is not to break the Prometheus user base to make the spec more favorable to other specs, so I'm not sure how we could commit to this one 😬

@bwplotka
Member

bwplotka commented Dec 4, 2024

I think this would be great to consider, thanks! Essentially, it removes the notion of a metric family. We already do something similar in the protobuf format.

Also, parsers could generate non-complex types for classic histograms and summaries if needed.
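A rough sketch of that expansion step, assuming the parser has already read one complex histogram value. The `expand` and `format_le` helpers are hypothetical, and the tuple layout (series name, labels, sample) is just a convenient illustration rather than any real parser's output type.

```python
import math


def format_le(bound: float) -> str:
    """Render a bucket upper bound as an le label value."""
    return "+Inf" if math.isinf(bound) else repr(bound)


def expand(name: str, buckets: dict[float, float], count: float, total_sum: float):
    """Expand one complex histogram value into classic suffixed series."""
    series = []
    # Emit buckets in ascending order of upper bound, ending with +Inf.
    for bound in sorted(buckets):
        series.append((f"{name}_bucket", {"le": format_le(bound)}, buckets[bound]))
    series.append((f"{name}_count", {}, count))
    series.append((f"{name}_sum", {}, total_sum))
    return series


series = expand(
    "request_duration_seconds",
    {0.1: 3.0, math.inf: 5.0},
    count=5.0,
    total_sum=1.2,
)
```

With a step like this in the parser, existing queries against the _bucket, _count, and _sum series could keep working even if the exposition format carries a single complex value.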

Two extra downsides to think about:

  • Is there a text format representation that is readable AND easy to generate AND efficient enough to parse, for such a model?
  • It would make OM 2.0 text significantly different from 1.0 and Prometheus text, so some education and a big change in parsers/generators would be needed. Not a blocker, but something to keep in mind as a con.

@bwplotka
Member

bwplotka commented Dec 4, 2024

> With Native Histograms stabilizing, is there a scenario where a user would prefer classic histograms over native histograms? If not, should we invest effort in this? 😅

Yes, and we could simply represent NHCB (native histograms with custom buckets) directly in the text format too.

@bwplotka
Member

bwplotka commented Dec 4, 2024

> In our first call, we mentioned that one of our intentions with 2.0 is not to break the Prometheus user base to make the spec more favorable to other specs, so I'm not sure how we could commit to this one 😬

How is it breaking? While it would require a parser redesign, you can represent the current Prometheus model just fine with this idea, no?

@dashpole
Author

dashpole commented Dec 4, 2024

Added to the list of cons.

@dashpole
Author

dashpole commented Dec 5, 2024

Unsurprisingly, I'm not the first one to suggest this. @bwplotka pointed me to https://github.com/prometheus/OpenMetrics/blob/main/legacy/markdown/protobuf_vs_text.md#implied-data-model by @beorn7, which contains a much more thorough description of the tradeoffs involved in using single-line representations of complex types.

@dashpole
Author

dashpole commented Dec 5, 2024

Regarding "PromQL queries become more complicated because accessing fields requires functions": I learned that this migration may happen independently of this proposal, based on https://github.com/prometheus/proposals/blob/main/proposals/2024-01-26_classic-histograms-stored-as-native-histograms.md#reading-via-promql.
