RFE: New package for is-online checks #962

Open
7 tasks
engelmi opened this issue Oct 15, 2024 · 0 comments
Labels: enhancement (New feature or request), jira (Issues that are synced to Jira)

engelmi commented Oct 15, 2024

This RFE resulted from #858 and #954 and is a continuation of both.

Please describe what you would like to see

It would be great to have a new, dedicated package, which can optionally be installed, that provides the following features:

  • A small CLI utility program for checking the connection state of the Agent|Node|System with simple semantics: exit code 0 if the component is online, 1 if it is offline. For example:
$ bluechi-is-online --help
bluechi-is-online [agent|node|system] [OPTIONS]
If online, exit with 0. Otherwise 1.

Options:
--monitor: keeps monitoring as long as agent|node|system is online. Will only exit if offline detected. 
--initial-wait: in seconds. If not online, then monitor n seconds. 
  • Wrapping systemd unit(s) using that CLI program to enable defining systemd dependencies on the connection state. The systemd unit state semantics would be:
    • unit is active = agent|node|system is online
    • unit is inactive|failed = agent|node|system is offline

For example:

[Unit]
...
[Service]
# will keep it in "activating" state. if it fails, will be restarted by bluechi-agent.service
ExecStartPre=/usr/bin/bluechi-is-online --initial-wait=2
ExecStart=/usr/bin/bluechi-is-online --monitor

[Install]
# UpheldBy= is an [Install] setting and takes effect once the unit is enabled
UpheldBy=bluechi-agent.service
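
The exit-code semantics of the proposed CLI described above can be illustrated with a minimal Python sketch. It is not the BlueChi implementation: `run_check`, `probe`, and `poll_interval` are hypothetical names, and how the connection state would actually be queried is left to the injected `probe` callable. The sketch only shows one way `--initial-wait` and `--monitor` could map onto exit codes 0 and 1.

```python
import time
from typing import Callable


def run_check(
    probe: Callable[[], bool],
    initial_wait: float = 0.0,
    monitor: bool = False,
    poll_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return 0 if the component is online, 1 if it is offline.

    probe         -- hypothetical callable returning True while online
    initial_wait  -- seconds to keep polling if not online at first
    monitor       -- if True, block while online and return 1 once offline
    poll_interval -- assumed polling period in seconds (not in the RFE)
    """
    deadline = time.monotonic() + initial_wait

    # Initial check, retried until the initial-wait window has elapsed.
    online = probe()
    while not online and time.monotonic() < deadline:
        sleep(poll_interval)
        online = probe()

    if not online:
        return 1
    if not monitor:
        return 0

    # Monitor mode: only return once an offline state is detected.
    while probe():
        sleep(poll_interval)
    return 1
```

Under these assumptions, `ExecStartPre=/usr/bin/bluechi-is-online --initial-wait=2` would correspond to `run_check(probe, initial_wait=2)`, and `ExecStart=/usr/bin/bluechi-is-online --monitor` to `run_check(probe, monitor=True)`.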

Please describe the solution you'd like

Both the CLI tool and the systemd unit(s) would be shipped in a new, optionally installable package (e.g. bluechi-is-online) which recommends the bluechi-controller and bluechi-agent packages.

  • Create new package in src directory and implement CLI tool for checking/monitoring the connection status of the Agent|Node|System
  • Create the wrapping systemd unit(s)
  • Create new package in the rpm spec (name could be bluechi-is-online)
  • Update documentation
    • Add man pages for new CLI tool
    • Update readthedocs and add a new section with example usage
  • Implement unit|integration tests

Please describe your use case

Quoting from #954:

When the network connection is lost, the current agent behavior is to disconnect and keep trying to reconnect. I would like to add another agent behavior where the agent terminates with a failure. Then we can expect the following.
...
There is a requirement that when a node running bluechi-agent is disconnected, running units of the node should be terminated, because they will be re-executed on another node.

Quoting from #858:

However, currently it is not possible to write systemd services that run as soon as one or all bluechi-agents are connected on the controller machine (the same applies for the agent on its machine)
...
it is possible to use these targets as a synchronization point for other systemd units.

engelmi added the enhancement (New feature or request) and backlog (This is next up in priority) labels on Oct 15, 2024
engelmi added the jira (Issues that are synced to Jira) label and removed the backlog (This is next up in priority) label on Oct 17, 2024
engelmi self-assigned this on Oct 17, 2024