Misleading error message interaction between ExecStart= and ExecStartPost= #160

Open
PenelopeFudd opened this issue Jun 22, 2023 · 3 comments

@PenelopeFudd

Hi,

We have an Ansible deployment script that installs this service file:

[Unit]
Description=rabbitmq-server - RabbitMQ broker
After=network.target [email protected]
Wants=network.target [email protected]

[Service]
Type=notify
User=rabbitmq
Group=rabbitmq
UMask=0027
NotifyAccess=all
TimeoutStartSec=3600
LimitNOFILE=32768
Restart=on-failure
RestartSec=10
WorkingDirectory=/var/lib/rabbitmq
ExecStart=/usr/lib/rabbitmq/bin/rabbitmq-server
ExecStartPost=-+/home/application/bin/python3 /usr/local/bin/rabbitmq_detect_msg_store_corruption.py
ExecStop=/usr/lib/rabbitmq/bin/rabbitmqctl shutdown
SuccessExitStatus=69

[Install]
WantedBy=multi-user.target

When we start the service, we get this:

$ sudo systemctl start service rabbitmq-server

Unable to start service rabbitmq-server: ERROR:systemctl: rabbitmq-server.service: Exec command does not exist: (ExecStartPost) /home/application/backend/bin/python3

$ echo $?
1

The error message turned out to be a red herring. Neither the Ansible script nor the service file has been changed in over a year, and the error message has apparently been printed all this time without returning an error code.

The actual error turned out to be that we had changed a password in RabbitMQ's configuration file and failed to URL-escape it. When ExecStart runs, the server writes an error to a random log file and exits with a non-zero return code.
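
For illustration, a minimal sketch of URL-escaping a password before embedding it in a connection URI. It assumes the password ends up inside an AMQP-style URI; the user name, host, port, and password below are hypothetical and not taken from our configuration.

```python
from urllib.parse import quote


def escape_password(password: str) -> str:
    """Percent-encode every reserved character so the password is safe inside a URI."""
    # safe="" makes quote() encode "/" as well, which it leaves untouched by default.
    return quote(password, safe="")


# Hypothetical credentials and host, for illustration only.
uri = "amqp://app:{}@localhost:5672/".format(escape_password("p@ss/w:rd#1"))
print(uri)
```

Without the escaping step, the "@", "/", ":" and "#" characters in the password would be read as URI delimiters.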

It would be nice if systemctl had printed this instead:

Unable to start service rabbitmq-server: ERROR:systemctl: rabbitmq-server.service: ExecStart command exited with an ExitStatus of 1: (ExecStart) /usr/lib/rabbitmq/bin/rabbitmq-server

Thanks!

@gdraheim
Owner

Sadly, this is impossible because docker-systemctl-replacement is not a server that can watch its children. It cannot see the return code of the ExecStart process; it only detects a "failed" service once that PID has vanished.
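
The distinction can be sketched in a few lines of Python. This is not the code of docker-systemctl-replacement; the helper name `pid_alive` is hypothetical and the example assumes a POSIX system. A parent that waits on its own child gets the exit status, while an observer that only knows a PID can merely ask whether that process still exists.

```python
import os
import subprocess
import sys


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (signal 0 delivers nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# A parent that waits on its own child can collect the exit status ...
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
status = child.wait()
print("own child exited with", status)

# ... but an observer that only has the PID learns nothing beyond its absence.
print("still alive:", pid_alive(child.pid))
```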

Support for the "-+" prefix is a separate matter, however. Currently "+" (for "nouser") is ignored, so the command fails when python3 is not accessible to user rabbitmq. This may change in the future.
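
As a rough sketch of what handling both prefixes could look like (an assumption for illustration, not the current implementation): the hypothetical helper `split_exec_prefixes` strips leading "-" and "+" markers and records them as flags. systemd defines further prefixes such as "@", ":", "!" and "!!", which this sketch deliberately ignores.

```python
from typing import NamedTuple


class ExecCommand(NamedTuple):
    ignore_failure: bool   # "-" prefix: a non-zero exit is not treated as a failure
    full_privileges: bool  # "+" prefix: run without the User=/Group= restriction
    command: str           # the command line with the prefixes removed


def split_exec_prefixes(value: str) -> ExecCommand:
    """Strip leading "-" and "+" markers from an Exec*= value, in any order."""
    ignore_failure = False
    full_privileges = False
    rest = value.strip()
    while rest and rest[0] in "-+":
        if rest[0] == "-":
            ignore_failure = True
        else:
            full_privileges = True
        rest = rest[1:]
    return ExecCommand(ignore_failure, full_privileges, rest)


parsed = split_exec_prefixes(
    "-+/home/application/bin/python3 /usr/local/bin/rabbitmq_detect_msg_store_corruption.py"
)
print(parsed)
```

With the flags separated from the path, the path that is checked or reported in an error message no longer carries the prefix characters.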

@PenelopeFudd
Author

OK, good to know.
I had been under the impression that it could see the return value of the exec() call if the process exited immediately (after daemonizing, for instance), just not if it kept running.
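
The idea I had in mind can be sketched as follows. This only illustrates that impression and is not how docker-systemctl-replacement works; the function name `start_and_check` and the grace period are hypothetical. The launcher waits briefly on the child it just started and reports the exit code only if the child ends within that window.

```python
import subprocess
import sys
from typing import Optional, Sequence


def start_and_check(argv: Sequence[str], grace_seconds: float = 1.0) -> Optional[int]:
    """Start argv; return its exit code if it ends within grace_seconds, else None."""
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        return None  # still running after the grace period: treat it as started


# Simulate a server that exits at once, for example because of a bad configuration.
code = start_and_check([sys.executable, "-c", "import sys; sys.exit(1)"])
if code is not None and code != 0:
    print("ExecStart command exited with an ExitStatus of %d" % code)
```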

@PenelopeFudd
Author

In this case, is it trying to exec() the program "+/home/application/bin/python3" and failing?
Wouldn't it be possible to say "Path '%s' is not absolute, will not exec()" or, if relative paths are allowed, just "Pathname '%s' not found, will not exec()"?
That would be helpful whether or not "+" for nouser is implemented.
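
A minimal sketch of such a check, assuming relative paths are rejected. The function name `check_exec_path` is hypothetical, and the third message (not executable) is an extra case added here to cover a binary that exists but is not accessible to the service user.

```python
import os
from typing import Optional


def check_exec_path(path: str) -> Optional[str]:
    """Return a diagnostic message if path cannot be exec()ed, else None."""
    if not os.path.isabs(path):
        return "Path '%s' is not absolute, will not exec()" % path
    if not os.path.exists(path):
        return "Pathname '%s' not found, will not exec()" % path
    if not os.access(path, os.X_OK):
        # Extra case, not part of the wording suggested above.
        return "Pathname '%s' is not executable by the current user, will not exec()" % path
    return None


# An unstripped "+" prefix makes the path look relative.
print(check_exec_path("+/home/application/bin/python3"))
```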
