feat: add a startup field for checks, and start-checks and stop-checks actions #560

tonyandrewmeyer · 2025-01-16T07:33:05Z

This is not at all ready - the PR gives me a convenient place to put notes.

Add support for starting and stopping checks, in a very similar manner to starting and stopping services.

Layer

The layer specification gains a new startup field for checks, which must be either enabled or disabled, and defaults to enabled for backwards compatibility.
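As a rough illustration of the validation and defaulting described above, here is a minimal Go sketch. The type and function names (`CheckStartup`, `validateStartup`, `effectiveStartup`) are hypothetical and are not taken from the PR; the only behaviour assumed is what the description states: the value must be enabled or disabled, and an unset value is treated as enabled.

```go
package main

import "fmt"

// CheckStartup is a hypothetical type for the check "startup" field.
type CheckStartup string

const (
	StartupUnset    CheckStartup = ""
	StartupEnabled  CheckStartup = "enabled"
	StartupDisabled CheckStartup = "disabled"
)

// validateStartup rejects anything other than enabled, disabled, or unset.
func validateStartup(checkName string, s CheckStartup) error {
	switch s {
	case StartupUnset, StartupEnabled, StartupDisabled:
		return nil
	default:
		return fmt.Errorf("check %q: startup must be %q or %q, not %q",
			checkName, StartupEnabled, StartupDisabled, s)
	}
}

// effectiveStartup applies the backwards-compatible default:
// a check with no startup value behaves as if it were enabled.
func effectiveStartup(s CheckStartup) CheckStartup {
	if s == StartupUnset {
		return StartupEnabled
	}
	return s
}
```

Keeping the unset value distinct from enabled, and resolving it only when the value is used, is one way to avoid writing the default back into the stored plan.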

The effect of this field is that when the plan is updated (including when Pebble first starts) or a layer is added, new checks with startup: disabled are not started. If a check's configuration changes (for example, the timeout or period) and the check is set to startup: disabled but is currently running, it is restarted, as it was before this change. Any stopped checks with startup: enabled are also started when the plan changes.

This field is implemented via the existing check manager PlanChanged function. That function is registered as a listener for plan changes, which include new checks being added and check fields (including startup) changing. The function is adjusted in this PR to start new checks only if startup is enabled. TODO: write up the replan details here.
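The plan-change rules described in this section can be sketched as a small decision function. This is not the PR's PlanChanged implementation: `checkConfig`, `action`, and `planChangedAction` are hypothetical names, and two details are assumptions made for the sketch (any field difference counts as a configuration change, and a stopped check with startup: disabled stays stopped).

```go
package main

// checkConfig is a hypothetical, simplified view of a check's plan entry.
type checkConfig struct {
	Startup string // "enabled", "disabled", or "" (treated as enabled)
	Period  string
	Timeout string
}

type action string

const (
	actionNone    action = "none"
	actionStart   action = "start"
	actionRestart action = "restart"
)

// planChangedAction decides what to do with one check when the plan changes.
// old is nil when the check did not exist in the previous plan.
func planChangedAction(old *checkConfig, running bool, updated checkConfig) action {
	enabled := updated.Startup != "disabled"
	switch {
	case old == nil:
		// New check: only start it if startup is enabled.
		if enabled {
			return actionStart
		}
		return actionNone
	case running:
		// Running check: restart on any configuration change,
		// even when startup is disabled (assumption: any field counts).
		if *old != updated {
			return actionRestart
		}
		return actionNone
	default:
		// Stopped check: start it only if startup is enabled.
		if enabled {
			return actionStart
		}
		return actionNone
	}
}
```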

CLI

Minor change: since replan now impacts checks as well as services, it is moved in the help listing from the services section to the plan section.

Two new commands are added: start-checks and stop-checks. These take one or more check names as arguments, and start or stop the specified checks. Starting a running check or stopping a stopped check is a no-op.
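The no-op semantics can be sketched as follows. The `checkSet` type and its `apply` method are hypothetical and exist only for illustration; the sketch also validates every name before changing anything, following the commit in this PR that checks that all the checks exist upfront. The error wording is an assumption.

```go
package main

import "fmt"

// checkSet is a hypothetical stand-in for the check manager's state:
// it maps each check name in the plan to whether it is currently running.
type checkSet struct {
	running map[string]bool
}

// apply starts or stops the named checks and returns the names whose
// state actually changed. Starting a running check or stopping a
// stopped one is a no-op, so such names are not in the result.
func (s *checkSet) apply(action string, names []string) ([]string, error) {
	if action != "start" && action != "stop" {
		return nil, fmt.Errorf("invalid action %q", action)
	}
	// Validate all names first, so a bad name cannot leave the
	// request partially applied.
	for _, name := range names {
		if _, ok := s.running[name]; !ok {
			return nil, fmt.Errorf("cannot find check %q in plan", name)
		}
	}
	want := action == "start"
	var changed []string
	for _, name := range names {
		if s.running[name] == want {
			continue // already in the requested state
		}
		s.running[name] = want
		changed = append(changed, name)
	}
	return changed, nil
}
```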

The checks command output is altered to include a new "Startup" column, which is simply the value of the field from the combined plan for that check, and to show inactive in the "Status" column for any checks that have been stopped (these also get a - value for both "Change" and "Failures").

start-checks and stop-checks are very similar to start and stop, except that they are synchronous, so they do not offer wait. Checks currently have exactly one change (this PR changes that to zero or one), with the appropriate tasks contained within that change. Since start-checks and stop-checks can operate on more than one check, they are operating on more than one change, so an interface where a single change ID is returned (and can then be waited on) does not fit.

We could offer only single-check start-check and stop-check commands, but in the spec review we preferred to have the option to specify multiple checks (as with services). For services, a new change is created that contains the tasks to start or stop all of the services (in the appropriate order). Doing this for checks does not seem to make sense, given that there is already a change running for each check. We could have a new response type that can wait for multiple changes, but that seems like a lot just for this small feature.

API

The existing /v1/checks endpoint is extended to also support POST, requiring admin access. The structure for this endpoint is similar to posting to /v1/services, in that it requires a list of check names and an action to perform (currently either start or stop).
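To make the request shape concrete, here is a minimal sketch of decoding and validating such a POST body. The JSON field names (`action`, `checks`), the `checksPayload` type, and the error messages are assumptions modelled on the description above, not confirmed details of the endpoint.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checksPayload is a hypothetical request body for POST /v1/checks.
type checksPayload struct {
	Action string   `json:"action"`
	Checks []string `json:"checks"`
}

// parseChecksPayload decodes the body and rejects unknown actions
// and empty check lists.
func parseChecksPayload(body []byte) (*checksPayload, error) {
	var p checksPayload
	if err := json.Unmarshal(body, &p); err != nil {
		return nil, fmt.Errorf("cannot decode request body: %w", err)
	}
	if p.Action != "start" && p.Action != "stop" {
		return nil, fmt.Errorf("invalid action %q", p.Action)
	}
	if len(p.Checks) == 0 {
		return nil, fmt.Errorf("must specify checks for %s action", p.Action)
	}
	return &p, nil
}
```

Under these assumptions, a body such as `{"action": "stop", "checks": ["chk1", "chk2"]}` would be accepted, while a missing or unknown action would be rejected before any check is touched.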

TODO

Spec - OP052 (internal link only, sorry).


tonyandrewmeyer · 2025-01-20T02:25:50Z

@benhoyt as discussed, if you could give this a "pre-review", that would be great, thanks!

tonyandrewmeyer added 18 commits January 16, 2025 19:17

WiP

9337a8c

Fix older wording.

138c087

Remove module that's not required any more.

692be67

go fmt ./...

1d1744c

Merge branch 'master' into disable-checks

45872a7

Add initial documentation.

f4a7a88

Fix merge.

a22b1ef

Ignore inactive checks in health calculation.

e76435c

Clarify behaviour when a layer is added.

85021c2

Adjust error messages as per discussion in canonical#557

c0d483e

Get rid of autostart for checks - there doesn't really seem to be a need, and it wasn't in the spec.

d927fb1

Small doc improvement.

d565153

Use a more consistent error message.

708879e

Explain how the CLI reference docs are updated.

bfc6a7f

Update the help reference documentation.

131eaae

Put the default startup behaviour in the manager, so that the exported plan doesn't gain a lot of 'startup: enabled' lines.

59365b5

Check that all the checks exist upfront, rather than erroring out partway through.

cf7174d

Fix existing tests.

239443d

tonyandrewmeyer requested a review from benhoyt January 20, 2025 02:25

Add a temporary solution to starting checks in a replan.

5e641a2
