-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vdsm: get gluster volume info from any gluster peer #412
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the PR was bad formed, and I guess that it's more clean this PR. Here is your question in the other PR, I think this was added years ago because not specifying the server could be a problem. Not sure that the gluster cluster can cope with a case when one of the server is down. Gluster cluster can cope with one of the server down, the issue is triggering because adding the --remote-host to the gluster cli limits the gluster query to this host, avoiding querying to the backup servers. Maybe it's a mechanism to avoid to register not updated info. _Please check git history to understand when and why self.volfileserver was added. In the git history the function was always called with the sef._volfileserver, as you can see here 907560b. Also how did you test this change? Please describe what was tested - how you simulated the issue and tested that this change solves the issue, and how you tested that this change does not cause regressions in other flows. Regarding the tests, I will describe in a new comment if you don't mind. |
@josgutie ok lets continue here, but please do not close PRs and submit new one like this. If there was an issue with the previous PR it can be easily fixed by pushing a new version. |
$ /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "SELECT connection, mount_options FROM storage_server_connections;" $ grep -E "storage|mnt_options" /etc/ovirt-hosted-engine/hosted-engine.conf $ df -h | grep glusterSD |
@nirs suggested testing a new deployment, it will take some time, but I guess that I could check the scenario. |
Test looks good, and sure we want to test the common operations like creating a new storage domain, putting storage domain to maintenance and activating it. But I think you change is not complete - it assumes that So the logic seems to be:
Also not using |
@nirs the gluster command doesn't have any parameter to point to the backup servers, you can review the following client documentation Gluster Command Line Interface. In fact it's hard to find the definition of the --remote-host gluster client parameter. Regarding the suggested change, the problem is if the volfile server is down, the call will fail because --remote-host limits the call to the volfile server and omit calling to the backup servers. We could deal with connection issue in that part of the code, but from my point of view, it's a gluster cli responsibility not vdsm. I will create and destroy a gluster storage domain. |
New tests completed:
The only thing left to test is to perform a full RHHI 1.8 deployment. |
Hi,
In this new deployment I've created a new brick, glusterfs volume and glusterfs storage domain. If you need further details of the tests performed, do not hesitate to ask. |
/ost |
/ost |
8edc354
to
a000ba4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
@josgutie linter failed, mind the longer lines. Limit is 79 characters. I was also wondering if you could add a small test that throws an exceptions and triggers the retry with the backup servers. So that at least we add some coverage for the new code. |
4928a83
to
c15b799
Compare
c72cad3
to
de8ff6b
Compare
/ost |
3 similar comments
/ost |
/ost |
/ost |
/ost el8stream basic-suite-master |
/ost basic-suite-master el8stream |
el8stream OST is not working, el9stream should pass |
/ost basic-suite-master el9stream |
/ost |
@nirs do you have any comment on this PR? Otherwise, I will integrate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change will not work in all cases, is little bit messy and need better documentation. See comments on to make it more clear.
Hi Nir, PR is updated, my doubt in on the test coverage, but it should work in both cases, where backup server are defined and not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @josgutie! Looks very clean and safe change. The test can be renamed and documented.
The function _get_gluster_volinfo query the glusterfs volume info the the storage server, this is translated to the gluster client adding the parameter --remote-host which limits the query to one server, so we are converting the storage server as a single point of failure, if it is not available, it can led to cluster outtage. The proposed changed let the cluster cli to use any available gluster peer. Signed-off-by: José Enrique Gutiérrez Mazón <[email protected]>
/ost |
/ost |
1 similar comment
/ost |
The function _get_gluster_volinfo query the glusterfs volume info the the storage server, this is translated to the gluster client adding the parameter --remote-host which limits the query to one server, so we are converting the storage server as a single point of failure, if it is not available, it can led to cluster outtage. The proposed changed let the cluster cli to use any available gluster peer.