Skip to content

Conversation

@MasterPtato
Copy link
Contributor

No description provided.

@railway-app
Copy link

railway-app bot commented Jan 16, 2026

🚅 Deployed to the rivet-pr-3927 environment in rivet-frontend

Service Status Web Updated (UTC)
website 😴 Sleeping (View Logs) Web Jan 16, 2026 at 2:39 am
frontend-inspector 😴 Sleeping (View Logs) Web Jan 16, 2026 at 2:39 am
frontend-cloud 😴 Sleeping (View Logs) Web Jan 16, 2026 at 2:38 am

@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3927 January 16, 2026 00:15 Destroyed
Copy link
Contributor Author

MasterPtato commented Jan 16, 2026


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Jan 16, 2026

Pull Request Review: fix: day 1 patch, add dashboards

Summary

This PR includes a collection of bug fixes and adds two new Grafana dashboards (Epoxy and Pegboard). The changes are mostly good, with a few minor issues and observations.


Code Quality & Best Practices

✅ Good Patterns Observed

  1. Import organization (api-public/src/actors/kv_get.rs:5): Moving the Json import to the top-level imports is cleaner and follows the codebase convention of keeping all imports at the top of the file.

  2. Metrics consistency (guard/src/routing/mod.rs:130,147): Fixing metric label casing from "GATEWAY" and "API" to lowercase "gateway" and "api" improves consistency across metrics.

  3. Dashboard additions: New monitoring dashboards for Epoxy and Pegboard will improve observability.


Potential Issues & Concerns

⚠️ Code Issues

  1. Missing metric decrement before early returns (pegboard-runner/src/lib.rs:252-258)

    • Location: The CONNECTION_ACTIVE.dec() metric is only decremented at the end of the function (line 252), but there are several early error returns that bypass this code.
    • Problem: If update_alloc_res fails or other errors occur before line 252, the metric will never be decremented, leading to incorrect active connection counts.
    • Fix: The metric increment at line 138-144 should be paired with a decrement that is guaranteed to run (e.g., using a RAII guard pattern or ensuring it runs in all code paths).
    • Severity: Medium - This will cause metric drift over time, making monitoring unreliable.
  2. Unused parameter warning (gasoline/src/db/kv/mod.rs:623)

    • Issue: The parameter name changed from _worker_id to worker_id, but the diff doesn't show where this parameter is actually being used.
    • Question: Is worker_id actually used in the function body, or should it remain prefixed with underscore to avoid compiler warnings?
    • Severity: Minor - May cause compiler warnings if unused.
  3. Breaking change to metric name (gasoline/src/db/kv/mod.rs:226,1010)

    • Issue: SignalPending is renamed to SignalPending2.
    • Concern: This is a breaking change for any existing Grafana dashboards, alerts, or monitoring tools that reference the old metric name.
    • Question: Is this intentional? If so, consider:
      • Documenting the migration path
      • Updating any dependent dashboards
      • Consider keeping both metrics temporarily for backwards compatibility
    • Severity: Medium - Could break existing monitoring infrastructure.

Performance Considerations

No significant performance issues identified. The changes are mostly small fixes and additions.


Security Concerns

No security issues identified in this PR.


Test Coverage

⚠️ No test changes included: This PR makes functional changes but doesn't include any test updates or additions. Consider:

  • Testing the metric increment/decrement logic in pegboard-runner
  • Testing the renamed metric in gasoline
  • Integration tests for the routing label fixes

Recommendations

High Priority

  1. Fix metric decrement in pegboard-runner: Ensure CONNECTION_ACTIVE is decremented in all code paths, not just the happy path.

Medium Priority

  1. Clarify SignalPending2 rename: Document why this breaking change was made and update any dependent systems.
  2. Review worker_id usage: Verify the parameter is actually used, or revert to _worker_id.

Low Priority

  1. Add tests: Include test coverage for the behavioral changes.
  2. Grafana dashboard updates: Ensure existing dashboards are updated for the SignalPending2 metric name change.

Files Reviewed

✅ engine/packages/api-public/src/actors/kv_get.rs
✅ engine/packages/gasoline/src/db/kv/mod.rs
✅ engine/packages/gasoline/src/metrics.rs
✅ engine/packages/guard/src/routing/mod.rs
✅ engine/packages/pegboard-runner/src/lib.rs
📊 engine/docker/template/grafana-dashboards/epoxy.json
📊 engine/docker/template/grafana-dashboards/pegboard.json
📊 engine/docker/template/grafana-dashboards/gasoline.json (updated)
📊 engine/docker/template/grafana-dashboards/guard.json (updated)


Overall Assessment

This is a reasonable collection of bug fixes and monitoring improvements. The main concern is the metric decrement issue in pegboard-runner that could lead to incorrect metrics. The SignalPending2 rename also needs clarification to ensure it doesn't break existing monitoring.

Recommendation: Address the metric decrement issue before merging, and clarify/document the SignalPending2 change.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 16, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3927

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3927

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3927

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3927

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3927

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3927

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3927

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3927

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3927

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3927

commit: ee27a16

Comment on lines +117 to +119
"expr": "sum by (rivet_datacenter) (rate(rivet_epoxy_proposals_total{rivet_project=~\"$project\",rivet_datacenter=~\"$datacenter\"}[$__rate_interval]))",
"instant": false,
"legendFormat": "{{workflow_name}}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query groups by rivet_datacenter but legendFormat references {{workflow_name}} which doesn't exist in the query result. This will cause the legend to display incorrectly or be empty.

Fix:

"legendFormat": "{{rivet_datacenter}}"

Spotted by Graphite Agent

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +302 to +304
"expr": "sum by (rivet_datacenter) (rate(rivet_epoxy_pre_accept_total{rivet_project=~\"$project\",rivet_datacenter=~\"$datacenter\"}[$__rate_interval]))",
"instant": false,
"legendFormat": "{{workflow_name}}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query groups by rivet_datacenter but legendFormat uses {{workflow_name}}. The label in the legend template must match the grouping field.

Fix:

"legendFormat": "{{rivet_datacenter}}"

Spotted by Graphite Agent

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +406 to +408
"expr": "sum by (rivet_datacenter) (rate(rivet_epoxy_accept_total{rivet_project=~\"$project\",rivet_datacenter=~\"$datacenter\"}[$__rate_interval]))",
"instant": false,
"legendFormat": "{{workflow_name}}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mismatch between grouping (rivet_datacenter) and legendFormat ({{workflow_name}}). This pattern repeats throughout the dashboard and will result in incorrect or missing legend labels.

Fix:

"legendFormat": "{{rivet_datacenter}}"

Spotted by Graphite Agent

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants