Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement using system root trust ca for TLS server authentication for XDS #11470

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

kannanjgithub
Copy link
Contributor

No description provided.

@@ -46,7 +44,7 @@ final class CertProviderClientSslContextProvider extends CertProviderSslContextP
node,
certProviders,
certInstance,
checkNotNull(rootCertInstance, "Client SSL requires rootCertInstance"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CertProviderSslContextProvider.isMtls() can return the wrong value when rootCertInstance is null but the system root certs should be used. I don't know what that is used for, but have you check through the class to make sure the null is handled correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are usages of isMtls() /isClientSideTls()/isServerSideTls() for updating the SslContext.
The usage if isMtls() in server cert provider for setting the trust manager with the ca root certs won't be impacted because the change is only on the client side. In any case I think if (savedTrustedRoots != null) would do the same work as the isMtls() check here.

The usage of isMtls() in client cert provider for setting the keyManager is impacted by the change, and is rather done better with the check if (savedKey != null) instead of if (isMtls()).

The usages in the base class CertProviderSslContextProvider of isMtls() / isClientSideTls() / isServerSideTls() is very confusing to me, since both the client side and server side providers inherit from it (i.e. since a connection has a Xds client and a Xds server, will calling any of these methods on either party should return the same result, it doesn't look like it). Each of these usages also have a further nested check based on savedKey or savedTrustedRoots, I think the inner if conditions can just be used by themselves without the check first for sMtls() / isClientSideTls() / isServerSideTls(), unless those checks achieve anything meaningful.

Copy link
Contributor Author

@kannanjgithub kannanjgithub Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the usages in the base class CertProviderSslContextProvider that I mentioned above can we use only the inner if conditions nested under each of the isMtls() / isClientSideTls() / isServerSideTls() calls?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't be impacted because the change is only on the client side

Well, that's assuming we notice the issue if/when we add server support.

It looks like we should get rid of isMtls() and instead have new methods to check if keys or cert validation is configured, to be used as necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized the trio of methods isMtls() / isClientSideTls() / isServerSideTls() help do 2 things -

  1. When either the key or trusted roots get updated, instead of just calling updateSslContext with that value, it is used to first check whether either or both of key / trusted roots as applicable is not null before calling updateSslContext.
  2. And in the client side, if Mtls it will add client certificate to the context, and on the server side if Mtls it will add trusted roots to the context.

I think the easy way and the only thing needed is to augment isMtls() to

protected final boolean isMtls() {
    return certInstance != null && (rootCertInstance != null || isUsingSystemRootCerts;
  }

I have also introduced a test for Mtls with system roots for the client: XdsSecurityClientServerTest::tlsClientServer_useSystemRootCerts_requireClientAuth.

certificateValidationContextdationContext));
SslContextBuilder sslContextBuilder = GrpcSslContexts.forClient();
if (rootCertInstance != null
&& !tlsContext.getCommonTlsContext().getValidationContext().hasSystemRootCerts()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right. ca_certificate_provider_instance is supposed to be preferred over system_root_certs. From the gRFC:

We have added a system_root_certs field to the xDS CertificateValidationContext message. In the gRPC client, if this field is present and the ca_certificate_provider_instance field is unset, system root certificates will be used for validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The introduction of the system_root_certs field then seems redundant as the presence or absence of ca_certificate_provider_instance itself will decide whether the system root certs get used. For using system root certs we have to do nothing here w.r.t. setting trust manager in the SSLContextBuilder.

Copy link
Contributor Author

@kannanjgithub kannanjgithub Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the 2nd condition now with the understanding that the new boolean field system_root_certs may be redundant for Java but other implementations may still have a use for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClientSslContextProviderFactory has a check asserting that at least on of the fields is set. So it looks like that verification is interacting with the condition here. I'm not at all wild about how all this is plumbed; we should convert the tlsContext/CommonTlsContext into our own internal data model. But this is fine for now.

… regardless of the value of the new boolean field for system root certs.
savedTrustedRoots.toArray(new X509Certificate[0]),
certificateValidationContextdationContext));
SslContextBuilder sslContextBuilder = GrpcSslContexts.forClient();
if (rootCertInstance != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have a comment describing that null rootCertInstance implies hasSystemRootCerts because of ClientSslContextProviderFactory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -48,8 +48,9 @@ public SslContextProvider create(UpstreamTlsContext upstreamTlsContext) {
checkNotNull(
upstreamTlsContext.getCommonTlsContext(),
"upstreamTlsContext should have CommonTlsContext");
if (CommonTlsContextUtil.hasCertProviderInstance(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move this check into certProviderClientSslContextProviderFactory.getProvider(). That makes it more obvious what is going on and avoids deleting testProviderForClient_rootInstanceNull_expectError(). It'd be fine to move the checkNotNull checks to getProvider() as well.

(Although it looks like upstreamTlsContext.getCommonTlsContext() can never be null (except in tests), because proto won't return a null message, so UpstreamTlsContext.fromEnvoyProtoUpstreamTlsContext() will never pass null to the UpstreamTlsContext constructor. I see BaseTlsContext has the field mark @Nullable, but it looks to never be null. We could potentially add a checkNotNull to BaseTlsContext instead of checking for null here.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/** Use system root ca cert for TLS channel - no mTLS. */
@Test
public void tlsClientServer_useSystemRootCerts() throws Exception {
System.setProperty( "javax.net.ssl.trustStore", getCacertFilePathForTestCa());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need to be reverted at the end of the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Map<String, String> env = new HashMap<>();
env.put("create", "true");
FileSystem fileSystem = FileSystems.newFileSystem(uri, env);
Path cacertsFile = Paths.get("./cacerts-testca.jks");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely don't create files relative to the current directory. Use the temp file creation API instead (e.g., File.createTempFile()) or TemporaryFolder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

private String getCacertFilePathForTestCa() throws URISyntaxException, IOException {
final URI uri = getClass().getResource("/certs/cacerts-testca.jks").toURI();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add a random file like this that has no documentation as to what's in it and how it was made. Create the KeyStore on-demand instance. We do that multiple places, but this is fine to compare to. You'd use keyStore.store() after initializing the object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -220,6 +220,25 @@ private static CommonTlsContext buildCommonTlsContextForCertProviderInstance(
return builder.build();
}

@SuppressWarnings("deprecation")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add a new test helpers using the old deprecated field. Use setTlsCertificateProviderInstance instead of setTlsCertificateCertificateProviderInstance. It looks like buildNewCommonTlsContextForCertProviderInstance() does it the non-deprecated way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

setBootstrapInfoAndBuildDownstreamTlsContext(null, null, null, null, false, false);
buildServerWithTlsContext(downstreamTlsContext);

// for TLS, client only needs trustCa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks to be because false was passed for hasIdentityCert. You copied it, but it doesn't seem applicable with how this is being called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (fakeNameResolverFactory != null) {
NameResolverRegistry.getDefaultRegistry().deregister(fakeNameResolverFactory);
}
if (testName.getMethodName().equals("tlsClientServer_useSystemRootCerts")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please no. This is unlikely to be noticed, is trivial to break, and is a huge code smell. Set a File field to non-null and check it here, use try-catch to delete it in the test, TemporaryFolder, or File.deleteOnExit().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

upstreamTlsContext.getCommonTlsContext())) {
if (CommonTlsContextUtil.hasCertProviderInstance(upstreamTlsContext.getCommonTlsContext())
|| upstreamTlsContext.getCommonTlsContext().getCombinedValidationContextOrBuilder()
.getDefaultValidationContext().hasSystemRootCerts()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be something somewhere that is protected by GRPC_EXPERIMENTAL_XDS_SYSTEM_ROOT_CERTS. It looks like this change isn't yet supporting system_root_certs because it didn't change the validation logic in XdsClusterResource?

It is so very, very bad that the prexisting code is passing Downstream/UpstreamTlsContext this deep, and re-performing validation at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added validation logic in XdsClusterResource.

@@ -46,7 +44,7 @@ final class CertProviderClientSslContextProvider extends CertProviderSslContextP
node,
certProviders,
certInstance,
checkNotNull(rootCertInstance, "Client SSL requires rootCertInstance"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't be impacted because the change is only on the client side

Well, that's assuming we notice the issue if/when we add server support.

It looks like we should get rid of isMtls() and instead have new methods to check if keys or cert validation is configured, to be used as necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants