Migrate Guides with Cloud metrics from PromQL v0 to OpenMetrics v1 #4246

Open
dustin-temporal wants to merge 3 commits into main from openmetrics-v1-migration

dustin-temporal commented Feb 26, 2026

Summary

  • worker-health: Replace temporal_cloud_v0_poll_success_count, _sync_count, _timeout_count with v1 equivalents. Remove rate() wrapping from all Cloud metric queries (v1 metrics are pre-computed rates). Add temporal_cloud_v1_approximate_backlog_count guidance (was a TODO in the source). Update reference links to OpenMetrics metrics reference.
  • service-health: Fix metric display names (frontend_service_error_count -> service_error_count to match actual v1 names). Replace SDK temporal_activity_execution_failed with Cloud temporal_cloud_v1_activity_fail_count. Remove increase() from query on pre-computed rate metrics. Replace vague total_activities formula with concrete v1 metrics.
  • ha-monitoring: Replace v0 histogram queries (histogram_quantile(0.99, sum(rate(temporal_cloud_v0_replication_lag_bucket...)))) with simple v1 percentile references (temporal_cloud_v1_replication_lag_p99). Add p95 query. Replace temporal_cloud_v0_total_action_count with v1. Update reference links.

SDK metrics (schedule-to-start latency, task slots, sticky cache) are intentionally left as-is on worker-health since they have no Cloud v1 equivalent.
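
The query changes summarized above can be sketched as a before/after pair. This is an illustration only, not the exact text of the guides: the `[5m]` range window and the `by (le)` grouping are assumptions, and the v1 poll metric name is assumed to mirror its v0 name, so check it against the OpenMetrics metrics reference before use.

```promql
# Before (v0): a counter wrapped in rate().
# The [5m] window is an illustrative assumption.
rate(temporal_cloud_v0_poll_success_count[5m])

# After (v1): referenced directly, because the value is already a rate.
# Assumed name: the v1 equivalent is taken to mirror the v0 name.
temporal_cloud_v1_poll_success_count

# Before (v0): p99 replication lag computed from histogram buckets.
# The window and the `by (le)` grouping are illustrative assumptions.
histogram_quantile(0.99, sum(rate(temporal_cloud_v0_replication_lag_bucket[5m])) by (le))

# After (v1): the pre-computed percentile series named in this PR.
temporal_cloud_v1_replication_lag_p99
```

Wrapping a v1 series in rate() or increase() would apply a rate to something that is already a rate, which is why the wrappers are dropped rather than kept alongside the rename.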

Test plan

  • Verify all metric names match the OpenMetrics metrics reference
  • Verify v1 queries render correctly in Grafana (no rate()/increase() on pre-computed rates)
  • Confirm approximate_backlog_count guidance is accurate for current OpenMetrics Public Preview
  • Check that all internal links resolve correctly

🤖 Generated with Claude Code

┆Attachments: EDU-5957 Migrate Cloud metrics from PromQL v0 to OpenMetrics v1

Replace temporal_cloud_v0_* metrics with temporal_cloud_v1_* equivalents
across worker-health, service-health, and HA monitoring pages.

Key changes:
- worker-health: Replace v0 poll metrics with v1, remove rate() wrapping
  from queries (v1 metrics are pre-computed rates), add approximate_backlog_count
- service-health: Fix metric display names (drop frontend_ prefix), replace
  SDK activity_execution_failed with Cloud v1 activity_fail_count, fix query
  to not use increase() on pre-computed rates
- ha-monitoring: Replace v0 histogram queries with v1 pre-computed percentiles,
  add p95 query, update reference links

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel bot commented Feb 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: temporal-documentation
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Feb 26, 2026 7:48pm

github-actions bot commented Feb 26, 2026

📖 Docs PR preview links

dustin-temporal changed the title from "Migrate Cloud metrics from PromQL v0 to OpenMetrics v1" to "Migrate Guides with Cloud metrics from PromQL v0 to OpenMetrics v1" Feb 26, 2026
dustin-temporal marked this pull request as ready for review February 26, 2026 19:48
dustin-temporal requested a review from a team as a code owner February 26, 2026 19:48