privacy #2
Yes, definitely.
As you can see from the commit log in this repo, I have already folded in some of your work. Do you know what it would take to get "privacy" to work? Even better, would you be willing to contribute? :)
There are three more new PRs from today: bdrosen96/libhdfs3#18. They really should have just been one PR.
What do these ones do, @bdrosen96?
This fixes a bug with SASL negotiation when the protection setting is privacy or integrity for Kerberos/GSSAPI. The odd thing is that this only seems to reproduce on some clusters, so it was not noticed before. In addition, I also added two new session config variables to allow a user to manually specify RPC protection and data transfer protection rather than just getting it via SASL, in case that logic has further issues. I will have one more follow-up PR, likely tomorrow (#21), that will fix the new config variables to use the string names used by Hadoop instead of the SASL QOP number equivalents. As for testing 3, I can try to do so, assuming I can easily build it using the same process as I do now in our internal tests, but I'm not sure I will definitely be able to get to it tomorrow. Which changes do you think already exist in HAWQ?
Well, I'm glad that at least you have this in production, to pick up those bugs for as many configurations as possible. I think HAWQ implemented the KMS stuff, or at least there is some code touching that, and I haven't yet looked at your PRs to see if they amount to the same or not. Side question: do you know if the SASL implementation can be called to wrap/unwrap general bytes for encrypting YARN RPC streams? I would like to be able to command YARN from Python directly... Or, alternatively, there are various SASL implementations in Python (or wrappers), but I have no idea whether they implement the more complicated security layers. Any insight would be very useful.
The two new config vars are hadoop.rpc.protection and dfs.data.transfer.protection; with the last PR that went in this morning, they mean the same thing as the XML Hadoop vars of the same name and can have the same values (privacy, authentication or integrity). I'm not sure how the SASL implementation will work with YARN RPC since I have never tried to do so. I would not be surprised if it would work, assuming YARN uses similar RPC as the name node. Not sure that YARN has as many options for configuring RPC, though.
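For reference, the corresponding Hadoop XML settings look like this (a minimal sketch of core-site.xml / hdfs-site.xml entries; the three values are the levels named above):

```xml
<!-- core-site.xml: QOP for NameNode RPC -->
<property>
  <name>hadoop.rpc.protection</name>
  <!-- one of: authentication | integrity | privacy -->
  <value>privacy</value>
</property>

<!-- hdfs-site.xml: QOP for DataNode data transfer -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>privacy</value>
</property>
```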
Excellent stuff, looks good. Do you intend to work further on this? I wonder, is https://github.com/martindurant/libhdfs3-downstream now a good place for you to work against? It might be convenient for you to be nearer the upstream. As far as I know, YARN RPC uses exactly the same mechanisms as HDFS, even being defined by the same…
I don't have any current plans to make more changes unless/until a new issue comes up. Once all the needed changes are in the downstream repo and it has been confirmed to work with all the various cases my current repo does, then I could possibly switch. That probably explains why hadoop.rpc.protection is in core-site.xml and not hdfs-site.xml.
@bdrosen96: as expected, I am having trouble merging more of your code into the current HAWQ-based version, because it implements KMS, but in a different way from your code. For a trivial example, at https://github.com/ContinuumIO/libhdfs3-downstream/blob/master/libhdfs3/src/client/OutputStreamImpl.cpp#L255 it uses… Do you have any recommendations for moving forward? Is it possible to pull only the "privacy"-related material out of your code and trust that HAWQ's KMS stuff is right? I care much more about the former than the latter. Any other idea for how to consolidate would be appreciated. Unfortunately, I am no star C++ coder; just applying only your changes post-KMS is proving no simpler, since the base code is now different in places...
From what I can see, the version of KMS supported in HAWQ may have some functionality gaps:
1. It does not support Kerberos at all, where my version does.
2. It does not support specifying the KMS auth type.
3. It does not support tokens.
4. I don't think it supports the URL-safe Base64 variant — `std::replace(encoded.begin(), encoded.end(), '+', '-');` on encrypt, or `int rem = data.length() % 4;` on decrypt.

I have not yet looked at how the KMS itself is integrated into the encrypt/decrypt portions.
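A minimal sketch of what point 4 describes, assuming the KMS exchanges URL-safe, unpadded Base64 (the helper names are illustrative, not from either codebase):

```cpp
#include <algorithm>
#include <string>

// Convert standard Base64 output to the URL-safe alphabet before sending.
std::string toUrlSafeBase64(std::string encoded) {
    std::replace(encoded.begin(), encoded.end(), '+', '-');
    std::replace(encoded.begin(), encoded.end(), '/', '_');
    return encoded;
}

// Restore a URL-safe, unpadded string to standard Base64 before decoding.
std::string fromUrlSafeBase64(std::string data) {
    std::replace(data.begin(), data.end(), '-', '+');
    std::replace(data.begin(), data.end(), '_', '/');
    int rem = data.length() % 4;  // re-add the '=' padding that was stripped
    if (rem != 0)
        data.append(4 - rem, '=');
    return data;
}
```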
Would you, then, advocate working from your branch's tip and integrating whatever is needed on the HAWQ end, rather than the other way around, as I've been trying?
For point 4, it looks like this is actually partially handled, just in CryptoCodec.cpp. As for encoding, I see that in one place, but I don't yet see decode, and I think there is at least one other place that might need encoding. I would suggest first integrating the existing PR (without KMS) plus the new PRs from last week as one unit into the code base. Then start with a new PR to handle the KMS/HAWQ integration. The most likely way to handle this would be to first try to add token and Kerberos SPNEGO support for the existing HAWQ KMS as one PR.
The interesting thing is that it looks like the HAWQ KMS stuff may have copied mine to some extent (or we both copied from the same reference). If you look at HttpClient.cpp, you see we have the same macros for CURL.
Although they do seem to have lost the CURLOPT_SSL_VERIFYPEER option, which is important for supporting self-signed certificates, which I think we want to support.
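For context, this is the standard libcurl knob being referred to; a hedged sketch of how a client might expose it (the function name is illustrative — in production, pointing CURLOPT_CAINFO at the self-signed certificate is the safer route):

```cpp
#include <curl/curl.h>

// Configure TLS verification on a curl handle. Disabling verification is the
// blunt way to accept a self-signed KMS endpoint.
void configureTlsVerification(CURL *curl, bool allowSelfSigned) {
    if (allowSelfSigned) {
        curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L); // skip CA chain check
        curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L); // skip hostname check
    } else {
        curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1L);
        curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L); // verify cert hostname
    }
}
```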
I don't think I can merge the non-KMS PRs of yours, that's the problem - or at least, I'm unsure now how much knock-on code change there would be. For instance, the following single diff: […] doesn't match the current def of […].
I'm confused. I only see one version of createBlockOutputStream for both versions.
Now I'm confused too - let me get back to you on that!
OK, so I don't know where that came from, but you still have things like this: https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3/compare/ae2e980821066030f29d4d3ee1cafb3eab3fface...Pivotal-Data-Attic:a366a8b#diff-51b855d2105da1e2e82c3f52a31df6cbL247 where the current line has no reference to encryption (but maybe it should have!), but the diff adds protection after encryption parameters.
I did not do a lot of the work for supporting the read short-circuit stuff, because I don't use it (mostly due to restrictions) and because it seems to be unclear whether it is HDFS-2246 or HDFS-347 based.
I don't think it being missing in read short-circuit should block it from merging, so long as it will compile and work at least as well as it did previously. I'm not even sure that local reads support encryption, secure mode, and KMS anyway.
This may be the missing commits: 993a2b6. I will try to build and test, although I only have non-secured HDFS right now.
It does compile, and the hdfs3 test suite passes.
So we just need bdrosen96/libhdfs3#18, and then that leaves only the KMS?
I think those are in now, or are you saying I missed something?
Ah, looks like I missed seeing those commits added. I do see one thing in SessionConfig.h in getLogSeverity. Is that something you did? Also, if I get a chance tomorrow I'm going to try to make a private repo with some Docker image stuff that can be used to run Hadoop in various modes, as that might help with testing.
That would be very useful. Here is the image I use, with a Python dev environment, HDFS and YARN, but no security.
The issue with getLogSeverity is that the PR seems to have changed the ++i to just i, which I don't believe is correct.
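To illustrate the kind of bug being described — a hypothetical reconstruction, since the actual SessionConfig.h code may look different: if the loop's increment expression is reduced from ++i to plain i, it has no effect, so the index never advances.

```cpp
#include <cstring>

// Hypothetical severity-name lookup; the names and the fallback value are
// illustrative, not copied from SessionConfig.h.
static const char *kSeverityNames[] = {"FATAL", "ERROR", "WARNING", "INFO", "DEBUG"};

int getLogSeverity(const char *name) {
    // With "i" instead of "++i" as the third clause, i never changes, and the
    // loop either matches index 0 immediately or spins forever.
    for (size_t i = 0; i < sizeof(kSeverityNames) / sizeof(kSeverityNames[0]); ++i) {
        if (std::strcmp(name, kSeverityNames[i]) == 0) {
            return static_cast<int>(i);
        }
    }
    return 3; // fall back to INFO when the name is unknown
}
```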
Just sent you an invite to the repo I set up for this. I spent a couple of hours reworking and simplifying some existing testing framework that I use, so there might be some bugs. The script run_hadoop.sh in that repo should support secure and insecure modes and allow varying the various RPC and data settings that are relevant to hdfs3. It will bring up a cluster which uses hostnames inside a private domain that can be accessed from the host machine, but might use slightly non-standard ports (i.e., 9000 vs. 8020). There are also Docker images for HA insecure and HA secure, but the script does not currently support them. Adding support would likely not be too hard, but I don't think it is needed for most of the testing cases.
Is there any way I can turn on extra debug info to figure out the auth problem, i.e., why it is trying to connect as a proxy user?
According to the HDFS NameNode logs, when you use the hdfs CLI, you get an auth:KERBEROS followed by a client auth:KERBEROS, both for the principal of the logged-in user. libhdfs3, as it stands, does a successful auth:KERBEROS, followed by an attempted auth:KERBEROS via auth:PROXY, where the user principal is being used in the proxy. This is not normally allowed; only service accounts (like hdfs itself) are allowed to act on behalf of other users.
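For background, that proxying is gated by Hadoop's proxyuser (impersonation) settings in core-site.xml. A sketch of what lets a service account act on behalf of other users — the "hive" account name here is illustrative; an ordinary user principal has no such entry, which is why the attempt above is rejected:

```xml
<!-- core-site.xml: allow the "hive" service account to impersonate users
     connecting from any host. A plain user principal has no such entry,
     so a PROXY attempt under it fails, as in the NameNode logs above. -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```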
Interesting. That should allow you to test possible fixes for this. We definitely do not want the principal to be in the proxy (not with the @realm). I had tried a simple Scala test case to see what the behavior was for the UGI code, etc., and to try to mimic this in C++, but that made things worse, not better.
Hi, I have built this version of libhdfs3 and used it to access HDFS, but when privacy is set on both client and server, I get the error log from the NameNode below; it seems that, when doing handshake2, the client still selects the AUTH QOP. DEBUG org.apache.hadoop.ipc.Server: javax.security.sasl.SaslException: Problem with callback handler [Caused by javax.security.sasl.SaslException: Client selected unsupported protection: 1]
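For anyone decoding that message: to my understanding (this mapping is background knowledge about the SASL/GSSAPI security-layer byte, not code from this repo), the number lines up with the Hadoop protection levels roughly like this:

```cpp
// SASL/GSSAPI security-layer flags as single-byte values. "protection: 1"
// means the client asked for authentication only, so a server requiring
// privacy rejects the negotiation.
enum SaslProtection {
    AUTH_ONLY      = 1, // hadoop.rpc.protection = authentication (QOP "auth")
    AUTH_INTEGRITY = 2, // hadoop.rpc.protection = integrity      (QOP "auth-int")
    AUTH_PRIVACY   = 4, // hadoop.rpc.protection = privacy        (QOP "auth-conf")
};
```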
Did you also use the modified version of libgsasl: https://github.com/bdrosen96/libgsasl ?
Thank you, I will try it now. Currently I am using the official gsasl 1.8.0.
@bdrosen96 Hi, I have spent a lot of time building gsasl 1.8.1 and trying this libhdfs3 again, and I still get the same error. The Hadoop version is 2.9.2; hadoop.rpc.protection is set to privacy in core-site.xml and hdfs-client.xml. The first byte of the token received by the SASL server in dohandshake2 is still 1, which represents the auth method and causes the error. Is there something else I have missed, like some other special library?
@bdrosen96 My fault, I hadn't installed gsasl 1.8.1 properly; after correcting that, it works fine. Thank you very much.
@Librago, please document exactly what you did so that it can be useful to others.
@martindurant Of course. I am testing the Apache HAWQ database, but the libhdfs3 client cannot work with the NameNode when hadoop.rpc.protection is set to privacy, so I replaced it with this downstream version. There are three problems I ran into which may be useful to others.

First, gsasl_step returned GSASL_GSSAPI_INIT_SEC_CONTEXT_ERROR, and the NameNode log had nothing about it. This was because of how I had set the 127.0.0.1 entry in /etc/hosts: it turned the Kerberos principal name from username@hostname into localhost@hostname, which is not registered in the KDC. After changing the entry to 127.0.0.1 localhost, it works fine.

Second, the NameNode reported "Client selected unsupported protection: 1", which I asked about above. This is because the current official gsasl 1.8.0 does not support the privacy QOP, and this repo depends on libgsasl; I changed the gsasl version as bdrosen96 suggested.

Third, libgsasl 1.8.1 depends on OpenSSL 1.0.2, and my version is OpenSSL 1.1.1, which is used in many other places. The encryption interface called by digest-md5 encode and decode is different between these versions, so I changed those interfaces in libgsasl 1.8.1 and solved it. If you are fine with OpenSSL 1.0.2, it will be easy to avoid this.

Last, some other things are worth mentioning: parameters like dfs.block.access.token.enable, hadoop.rpc.protection and dfs.data.transfer.protection need to be set.
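On the third point, the usual source of that incompatibility — an assumption about what bit here, but a well-known break — is that EVP_CIPHER_CTX became an opaque type in OpenSSL 1.1.0, so older code that stack-allocated cipher contexts no longer compiles. The portable pattern builds against both 1.0.2 and 1.1.x:

```cpp
#include <openssl/evp.h>

// Sketch: heap-allocate the context through EVP_CIPHER_CTX_new/_free instead
// of declaring "EVP_CIPHER_CTX ctx;" on the stack (which stopped compiling
// once the struct became opaque in OpenSSL 1.1.0).
bool encryptBlock(const EVP_CIPHER *cipher,
                  const unsigned char *key, const unsigned char *iv,
                  const unsigned char *in, int inLen,
                  unsigned char *out, int *outLen) {
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    if (ctx == nullptr)
        return false;
    bool ok = EVP_EncryptInit_ex(ctx, cipher, nullptr, key, iv) == 1 &&
              EVP_EncryptUpdate(ctx, out, outLen, in, inLen) == 1;
    EVP_CIPHER_CTX_free(ctx);
    return ok;
}
```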
So not simple, then... You may want to wrap this into a script somehow and post it.
Sorry, I didn't notice your last reply, and I have run into another problem. When the parameter in Hadoop's core-site.xml is set to authentication+privacy, this client cannot work; the call stack and error are below. HdfsRpcException: RPC channel to "testcirpm33kerberosbasictest3589229-1:9000" got protocol mismatch: RPC channel cannot find pending call: id = -33. Do you know the reason? @martindurant @bdrosen96
Totally beyond me, sorry.
Possible values are authentication, integrity and privacy. authentication means authentication only, and no integrity or privacy; integrity implies authentication and integrity are enabled; and privacy implies all of authentication, integrity and privacy are enabled. So if you previously had privacy working, that should have included everything.
Yes, thanks, it should work; this parameter should only affect the supported QOP choices returned by the SASL server. There must be some other problem.
Hi @bdrosen96, I met the same issue as ghost, and found that the libhdfs3 at https://github.com/erikmuttersbach/libhdfs3 doesn't support hadoop.rpc.protection=PRIVACY; it doesn't process the RPC response with the wrap saslState.
I have integrated Apache HAWQ LibHdfs3 with my application, and during testing I found that the KMS Client Provider in that repository doesn't support Kerberos-based authentication. If I understand it right, this repo seems to have support for this specific use case. Can someone please confirm that this is indeed the case? Also, is there any plan to merge changes in this repo back into the Apache HAWQ repo?
I have not done anything with this stuff in several years, so I would have a hard time debugging things. There was an effort to get this merged into HAWQ back in 2017, but that seemed to have stalled before everything got in, and I don't recall the current status or where it stalled out. I think some context may be in the comment history here - not sure if it has links to the PRs that did not make it in.
@kgundamaraju: Kerberos authentication was indeed known to work in one of these variants, but I'm no longer sure which one. It's PRIVACY mode that was the tricky thing (see the discussion above) - and since you mention KMS, I assume that's actually what you'll want in the end. Can you use webHDFS, perhaps? It proved too hard to match all the use cases here, when the reference implementation is poorly documented Java. Since HDFS is available via the Java JNI library, and this was exposed by pyarrow for Python, usage of libhdfs3 dropped off, and that's why this repo is dormant. Even fsspec no longer uses hdfs3, which was the very first fsspec-style implementation.
My sincere thanks to @bdrosen96 and @martindurant for your prompt responses. @martindurant, as you have correctly pointed out, the use case that I'd like to support is for my application to communicate with Hadoop KMS with Kerberos authentication. This, if I understand it correctly, would require LibHdfs3 to support Kerberos HTTP SPNEGO authentication, which it currently doesn't, at least not in the Apache HAWQ repository that I downloaded LibHdfs3 from. I also entered a defect in the Apache HAWQ JIRA database (https://issues.apache.org/jira/browse/HAWQ-1791?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17318956#comment-17318956) and I did get confirmation that there is indeed no support for this specific use case in the LibHdfs3 in the Apache HAWQ repo. The reason why I am trying to use LibHdfs3 is that I found it to be much more performant than the Java JNI-based libHdfs. Before I give up on this effort, could you please comment on whether Kerberos HTTP SPNEGO authentication was ever added to this downstream repo? If yes, then I would like to spend some time trying to understand how I can merge these changes, as I have already invested a lot of time in integrating LibHdfs3 with my application and testing Kerberos support to communicate with the Hadoop cluster itself. The only use case that is currently not working for me is communication with the Hadoop KMS with Kerberos authentication. Thanks in advance.
It's been too long; I don't remember for sure. Yes, Kerberos authentication worked, but I suspect it was not HTTP SPNEGO (use that for HTTP, like webHDFS!), but direct connections with kinit, etc. Authentication and secure encrypted communication are not the same thing. From my vantage point, libhdfs3 was only used with Python hdfs3, so that was probably not the performance boost you were after anyway.
HTTP SPNEGO was used for communication with KMS, I believe. I added it to https://github.com/bdrosen96/libhdfs3 in this PR primarily: https://github.com/bdrosen96/libhdfs3/pull/15/files I do not recall the current status of that code with respect to bringing it into HAWQ, though. Hopefully the info in the linked PR (as well as the other PRs in the linked repo which set the stage for it or which fixed bugs after it) will be enough to help you make sense of what might be required.
Thanks again, @martindurant and @bdrosen96. As @bdrosen96 has stated, the use case I was interested in also required making the HTTP SPNEGO protocol, which is used between the HDFS client and the Hadoop KMS, work as well. @bdrosen96, I will download this repo and try to figure out what portion of this code has been merged into the Apache HAWQ repo and which portion is missing. Many thanks for pointing me to this repo as well as the PR.
@bdrosen96, great appreciation for your work! libhdfs3 in ClickHouse encountered the same problem (it doesn't support hadoop.rpc.protection=privacy). Could you kindly contribute your code to https://github.com/ClickHouse/libhdfs3.git, or allow me to merge your code into ClickHouse/libhdfs3? There are several steps, according to my understanding: 1. update ClickHouse's libgsasl to match yours; 2. cherry-pick your PRs into ClickHouse's libhdfs3; 3. modify ClickHouse's cmake. Step 2 may be a challenge for me; I'm not sure how many PRs should be merged into ClickHouse/libhdfs3. I think those PRs should leave out the KMS-related ones. Am I right?
Hi ghost! I've also encountered the same error in clickhouse/libhdfs3. Just as you described, I have also set 'hadoop.rpc.protection' in the 'core-site.xml' configuration file of my Hadoop environment to 'authentication+privacy', but I don't actually know what it means. So could you please tell me how I can quickly resolve this issue?
Unfortunately no, as you can see from the long thread. The short answer may be: use pyarrow, which is now the default HDFS backend within fsspec.
Are you suggesting that I should give up using ClickHouse to access HDFS?
I can't speak for ClickHouse, but this repo no longer aspires to allow "privacy" mode and is no longer being developed.
Thank you very much for your patient response. I believe I should focus more on the issue of connecting ClickHouse with HDFS rather than the issue with this repository.
@bdrosen96, would any of your work allow RPC "privacy" mode ({"hadoop.rpc.protection": "privacy"}) to be enabled?