-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve starting ipmi_sim program #9378
base: master
Are you sure you want to change the base?
Improve starting ipmi_sim program #9378
Conversation
Improve killing fake_ipmi_host.sh
👋 Hello! Thanks for contributing to our project. If you are unsure the failing tests are related to your code, you can check the "reference jobs". These are jobs that run on a scheduled time with code from master. If they fail for the same reason as your build, it means the tests or the infrastructure are broken. If they do not fail, but yours do, it means it is related to your code. Reference tests: KNOWN ISSUES Sometimes the build can fail when pulling new jar files from download.opensuse.org . This is a known limitation. Given this happens rarely, when it does, all you need to do is rerun the test. Sorry for the inconvenience. For more tips on troubleshooting, see the troubleshooting guide. Happy hacking! |
end | ||
|
||
When(/^the server stops mocking an IPMI host$/) do | ||
get_target('server').run('pkill ipmi_sim') | ||
get_target('server').run('pkill fake_ipmi_host.sh || :') | ||
get_target('server').run("ps aux | grep [f]ake_ipmi_host.sh | awk '{print $2}' | xargs kill") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add here verbose: true, so we can keep track of having duplicated processes.
@@ -580,12 +580,12 @@ | |||
raise ScriptError, 'File injection failed' unless success | |||
end | |||
server.run('chmod +x /etc/ipmi/fake_ipmi_host.sh', verbose: true, check_errors: true) | |||
server.run('ipmi_sim -n < /dev/null > /dev/null &', verbose: true, check_errors: true) | |||
server.run('nohup ipmi_sim -n > /var/log/ipmi_sim.log 2>&1 &', verbose: true, check_errors: true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe before starting a new one, we can have an assert/pre-condition where we assure there is no other process already running.
If we have it, we might want to skip the start with a warning message, or we might want to just fail the step.
end | ||
|
||
When(/^the server stops mocking an IPMI host$/) do | ||
get_target('server').run('pkill ipmi_sim') | ||
get_target('server').run('pkill fake_ipmi_host.sh || :') | ||
get_target('server').run("ps aux | grep [f]ake_ipmi_host.sh | awk '{print $2}' | xargs kill") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would stick with the previous version, there are two issues with it:
- Certain implementations of
ps
might lead to a different output, thus it may require extra handling in the future. || :
is kind of a safety mechanism, if grep won't find this process it may lead to errors.- It involves more moving components:
ps
,grep
,awk
, andxargs
,kill
whereas there you just havepkill
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree in your points @szachovy but @maximenoel8 said that pkill was not killing successfully, and we need to address it somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look in uyuni podman server you will see the pkill is not working:
uyuni-master-podman-srv:~ # ps aux | grep fake_ipmi_host.sh
root 10582 0.0 0.0 4236 3120 ? S Oct16 0:03 /bin/bash /etc/ipmi/fake_ipmi_host.sh
root 11787 0.0 0.0 9156 2336 pts/0 S+ 09:48 0:00 grep --color=auto fake_ipmi_host.sh
root 13152 0.0 0.0 4236 3104 ? S Oct16 0:03 /bin/bash /etc/ipmi/fake_ipmi_host.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--full
is missing:
uyuni-master-podman-srv:~ # ps aux | grep fake_ipmi_host.sh
root 10582 0.0 0.0 4236 3120 ? S Oct16 0:03 /bin/bash /etc/ipmi/fake_ipmi_host.sh
root 13152 0.0 0.0 4236 3104 ? S Oct16 0:03 /bin/bash /etc/ipmi/fake_ipmi_host.sh
root 13807 0.0 0.0 8780 2328 pts/1 S+ 10:03 0:00 grep --color=auto fake_ipmi_host.sh
uyuni-master-podman-srv:~ # pkill --full fake_ipmi_host.sh
uyuni-master-podman-srv:~ # ps aux | grep fake_ipmi_host.sh
root 13877 0.0 0.0 8780 816 pts/1 S+ 10:03 0:00 grep --color=auto fake_ipmi_host.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, it seems the proper fix for that killing, indeed.
Context
If one of the IPMI Power management features is failing, it will keep
ipmi_sim
process running. The next feature will try to start again thisipmi_sim
process. In this situation, the commandipmi_sim -n < /dev/null > /dev/null &
will be executed and wait for the previous process to stop ( nothing will stop the process ). The command will not return any code and the ssh connection will die. This situation will not be detected by the ssh timeout and the testsuite will stay stuck forever at this stage.What does this PR change?
Use nohup command to start the service. This command will return immediately a result even if the process is already running.
Improve the IPMI mocking cleanup. From my test,
pkill fake_ipmi_host.sh || :
was not killing successfullyfake_ipmi_host
. ( detected by rerunning several time the same features. )GUI diff
No difference.
Documentation
No documentation needed: only internal and user invisible changes
DONE
Test coverage
ℹ️ If a major new functionality is added, it is strongly recommended that tests for the new functionality are added to the Cucumber test suite
No tests: already covered
DONE
Links
Port:
Issue:
Changelogs
Make sure the changelogs entries you are adding are compliant with https://github.com/uyuni-project/uyuni/wiki/Contributing#changelogs and https://github.com/uyuni-project/uyuni/wiki/Contributing#uyuni-projectuyuni-repository
If you don't need a changelog check, please mark this checkbox:
If you uncheck the checkbox after the PR is created, you will need to re-run
changelog_test
(see below)Re-run a test
If you need to re-run a test, please mark the related checkbox, it will be unchecked automatically once it has re-run:
Before you merge
Check How to branch and merge properly!