Add pg_synchronized_standby_slots_invalid metric #19

Merged
tom-pang merged 1 commit into main from metric-for-sync-standby-slots-incorrect
Feb 16, 2026

tom-pang commented Feb 16, 2026

  • Adds pg_synchronized_standby_slots_invalid gauge metric — counts slots listed in the PG17 synchronized_standby_slots GUC that don't exist as physical replication slots
  • Non-zero value means logical replication is blocked and parallel workers may crash
  • Gated to PG 17+ (returns nil on older versions) and primary only (skips standbys via pg_is_in_recovery())
  • Default enabled, zero cost on unsupported versions

Testing

Unit tests cover these cases

`synchronized_standby_slots` must be a subset of the physical replication
slots. However, it is possible to configure Postgres so that this is not
true. This breaks replication and, worse than that, any query that uses
parallel background workers will fail to execute.

Add a `pg_synchronized_standby_slots_invalid` metric to track and detect
this. Its value is the number of invalid slots, so it is `0` when every
listed slot exists. The slot name is deliberately not exposed as a label,
since that could cause unbounded series growth.
tom-pang force-pushed the metric-for-sync-standby-slots-incorrect branch from 994f3aa to f1c6ef2 on February 16, 2026 17:41
)

synchronizedStandbySlotsQuery = `
SELECT count(*) AS invalid_count
Member

frouioui commented Feb 16, 2026

I would also include the name of the offending slot(s) so that we can add it as a label, which makes debugging easier.

Author

I think we can just include a query in the runbook; I don't want to cause a metric explosion here.
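
A runbook query of the kind mentioned in this reply might look like the sketch below, written as a Go raw string in the same style as the exporter's query constants. It is not part of this PR: the constant name `listInvalidSlotsQuery` is hypothetical, and the SQL assumes the GUC holds a comma-separated list of unquoted slot names. It uses the two-argument `current_setting(name, missing_ok)` form so that servers without the setting return no rows rather than an error.

```go
package main

import "fmt"

// listInvalidSlotsQuery is a hypothetical runbook query (not the query merged
// in this PR). It lists each name in synchronized_standby_slots that has no
// matching physical replication slot.
const listInvalidSlotsQuery = `
SELECT btrim(t.raw_name) AS slot_name
FROM unnest(
       string_to_array(current_setting('synchronized_standby_slots', true), ',')
     ) AS t(raw_name)
WHERE btrim(t.raw_name) <> ''
  AND NOT EXISTS (
    SELECT 1
    FROM pg_replication_slots s
    WHERE s.slot_name = btrim(t.raw_name)
      AND s.slot_type = 'physical'
  )
ORDER BY 1;
`

func main() {
	// Print the query so it can be pasted into psql on the primary.
	fmt.Print(listInvalidSlotsQuery)
}
```

Running the query by hand during an incident gives the slot names without adding a per-slot label to the metric.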

tom-pang merged commit 6670bd6 into main on Feb 16, 2026
7 checks passed
tom-pang deleted the metric-for-sync-standby-slots-incorrect branch on February 16, 2026 17:51