Stargate SDK Reference Configuration

Cedrick Lunven edited this page Jan 12, 2022 · 7 revisions

This page will cover the different options and configurations available to work with Stargate SDK. To get more information regarding one API in particular please refer to the proper chapter in the menu on the left.

📋 Table of contents

  1. Architecture
  2. SDK design principles
  3. Configuration StargateClient
  4. Sample Code StargateClient
  5. Load-Balancing and Failover

1. Architecture

Apache Cassandra™ is a distributed database where you can expect clusters to have multiple datacenters (DC) and, in each one, multiple nodes. In the following sample we illustrate configurations with a cluster of 2 DCs with 3 nodes each, as shown in the image below. The node names are the ones used in the docker-compose.yml.

ℹ️ For clarity, not all gossiping links are represented on the schema.

A Stargate node is similar to an Apache Cassandra™ node started with the option -Dcassandra.join_ring=false: it joins the cluster but does not store data. You can use its cqlEndpoint in the same way as a Cassandra node. For the different APIs (REST, GraphQL, gRPC) you need a security token that you pass through the request headers. Tokens are shared across Stargate nodes through a keyspace named data_endpoint_auth. Tokens are valid for 1800 seconds by default. The replication factor of this keyspace should match the topology of your cluster (same as system_auth).
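
As a minimal sketch of how a client could honor the token lifetime mentioned above, the class below caches a token and asks for a new one shortly before the 1800 seconds elapse. This is not the SDK's implementation: the class name CachedToken, the injected fetcher and clock, and the renewal margin are assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the SDK): caches a token and renews it before it expires. */
final class CachedToken {
    private final Supplier<String> fetcher;  // assumed: something that calls the authentication endpoint
    private final Supplier<Instant> clock;   // injected so the expiry logic can be tested
    private final Duration ttl;              // token lifetime, 1800 seconds by default in Stargate
    private final Duration margin;           // assumed safety margin: renew slightly early
    private String token;
    private Instant fetchedAt;

    CachedToken(Supplier<String> fetcher, Supplier<Instant> clock, Duration ttl, Duration margin) {
        this.fetcher = fetcher;
        this.clock = clock;
        this.ttl = ttl;
        this.margin = margin;
    }

    /** Returns the cached token, fetching a new one when none exists or the current one is about to expire. */
    synchronized String get() {
        Instant now = clock.get();
        if (token == null || !now.isBefore(fetchedAt.plus(ttl).minus(margin))) {
            token = fetcher.get();
            fetchedAt = now;
        }
        return token;
    }
}
```

With a TTL of 1800 seconds and a margin of 60 seconds, the fetcher would be called again once 1740 seconds have passed since the last fetch. Stargate's HTTP APIs read the token from the X-Cassandra-Token header.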

ℹ️ Node names are used in chapter 5 to illustrate load-balancing and are defined in the docker-compose.yml. They should be read as follows: dc1s1 is Stargate node 1 of dc1, dc1n2 is Cassandra node 2 of dc1...

2. Stargate Client Design Principles

A Stargate node exposes multiple APIs with multiple technologies and formats (REST, GraphQL, gRPC, CQL...). Java clients already exist for each of those technologies.

| Interface  | Rationale |
|------------|-----------|
| CqlSession | DataStax Java drivers have been developed for ages |
| Rest       | There are multiple HTTP clients available: JDK11, RestEasy, Spring Rest Template, Apache Http Components... |
| GraphQL    | All of the HTTP clients listed above, Netflix DGS, GraphQL Java, Spring GraphQL |
| gRPC       | The core gRPC client, the Stargate gRPC client... |

There is nothing wrong with picking one of those to work with Stargate based on your needs. So why an SDK? The purpose is to simplify configuration and usage. It also brings some additional features like object mapping, findAll, load-balancing, failover, monitoring...

📘 Design principles

  1. Users need a single class to set up the SDK, StargateClientConfig, with multiple ways to load the configuration (builders, YAML) and extension points to create your own.

  2. Users need a single class to use the SDK: StargateClient.

StargateClient sdk = StargateClient.builder().build();
ApiDocumentClient apiDocument    = sdk.apiDocument();
ApiDataClient apiDataRest        = sdk.apiRest();
ApiGrpcClient apiGrpc            = sdk.apiGrpc();
ApiGraphQLClient apiGraphClient  = sdk.apiGraphQL();
  3. Keep it Simple Stupid (KISS). A fluent API is a great way to guide developers through the different functions available. As an example, try to work out what this code does without any further documentation:
Stream<Document<String>> familyDoe = sdk
  .apiDocument()           // Use Document API
  .namespace("foo")        // Work in namespace (keyspace) 'foo'
  .collection("bar")       // Work with collection (table) 'bar'
  .findAll(SearchDocumentQuery.builder()
     .select("firstname")
     .where("lastname").isEqualsTo("Doe")
     .build(), String.class);
  4. Do not reinvent the wheel. Reuse existing clients and integrate them.
  • For CqlSession: Wrap the DataStax Java driver without hiding any possible customization.
  • For Rest: We pick Apache Http Components to limit third-party dependencies and keep backward compatibility with JDK8.
  • For GraphQL: We pick Apache Http Components to limit third-party dependencies and keep backward compatibility with JDK8. Also, the dynamic nature of the GraphQL endpoint does not lend itself to schema introspection.
  • For Grpc: Wrap the Stargate gRPC client provided by the Stargate team itself.
  5. Initialize only the clients that can be used, with no error messages for the others:
Initializing [StargateClient]
+ Stargate nodes #[1] in [DC1]
+ CqlSession   :[ENABLED] with keyspace [quickstart]
+ API Data     :[ENABLED]
+ API Document :[ENABLED]
+ API GraphQL  :[ENABLED]
+ API Grpc     :[ENABLED]

3. Configuration StargateClient

In the following chapter we will describe the different parameters available for the Builder.

StargateClientConfig config = StargateClient.builder();

📘 LocalDatacenter

This parameter is always required. It defines the current datacenter. Since driver 4.x the parameter is required by the CQL native driver, and it also designates which set of Stargate nodes to start with for the different APIs.

config.withLocalDatacenter("<datacenter_name>")

📘 ApiNode: Stargate Node for local datacenter

Define a Stargate node for the stateless APIs. It will be part of the local datacenter.

config.withApiNode(
  new StargateNodeConfig("name", "host", port_auth, 
      port_rest, port_graphql, port_grpc))

📘 ApiNodeDC: Stargate Node for specified Datacenter

Define a stargate node for the stateless APIs. It will be part of datacenter DCname

config.withApiNodeDC("DCname", 
   new StargateNodeConfig("name", "host", port_auth, 
      port_rest, port_graphql, port_grpc))

📘 ApplicationName

Populate application name field in CqlSession.

config.withApplicationName("appp")

📘 Credentials

This parameter is always required. The credentials are used to create a token from the authentication endpoints and to authenticate the CqlSession.

config.withAuthCredentials("username", "password")

📘 ApiToken: Static token

If provided, the token overrides any token provider. This is how Astra tokens are passed to the component.

config.withApiToken("token")

📘 ApiTokenProvider: Token provider for current datacenter

A token provider will try to generate a token against an authentication URL. You can specify a dedicated token provider so that Stargate nodes are not used for authentication. As no datacenter is provided, the current datacenter is used.

config.withApiTokenProvider("url1", "url2")

📘 ApiTokenProviderDC: Token provider for specified datacenter

A token provider will try to generate a token against an authentication URL. You can specify a dedicated token provider so that Stargate nodes are not used for authentication. The provider applies to the specified datacenter.

config.withApiTokenProviderDC("<datacenter_name>", "url1", "url2")

📘 CloudSecureBundle: SCB for current Astra region

Cloud Secure Bundle File used with Astra deployment. The localDatacenter should match the region name.

config.withCqlCloudSecureConnectBundle("/tmp/scb")

📘 CloudSecureBundleDC: SCB for target Astra region

Cloud Secure Bundle File used with Astra deployment. The DCname should match the target Astra region.

config.withCqlCloudSecureConnectBundleDC("<datacenter_name>", "/tmp/scb")

📘 ConsistencyLevel: for current datacenter

Override the default consistency level (LOCAL_QUORUM). The consistency level can be changed at each request.

config.withCqlConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM)

📘 Consistency Level: for target Datacenter

Override the default consistency level (LOCAL_QUORUM) for a specified datacenter. The consistency level can be changed at each request.

config.withCqlConsistencyLevelDC("<datacenter_name>", ConsistencyLevel.LOCAL_QUORUM)

📘 CqlContactPoints: For current datacenter

Needed to initiate the connection to the Cassandra nodes of the current datacenter. One contact point is enough; two are recommended in case the first is not available (try to pick the seed nodes if possible).

config.withCqlContactPoints("localhost:9052")

📘 CqlContactPointsDC: For target datacenter

Needed to initiate the connection to the Cassandra nodes of the target datacenter. One contact point is enough; two are recommended in case the first is not available (try to pick the seed nodes if possible).

config.withCqlContactPointsDC("<datacenter_name>", "localhost:9062")
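
Contact points are given as host:port strings. As a sketch of what such a string represents, the helper below turns it into the InetSocketAddress form that the DataStax driver's CqlSessionBuilder.addContactPoint accepts. The class ContactPoints and its parse method are hypothetical names, the SDK may parse the value differently, and IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;

/** Hypothetical helper (not part of the SDK): parses a "host:port" contact point. */
final class ContactPoints {

    /** Splits on the last ':' and returns an unresolved address (no DNS lookup is performed). */
    static InetSocketAddress parse(String hostAndPort) {
        int idx = hostAndPort.lastIndexOf(':');
        if (idx < 1 || idx == hostAndPort.length() - 1) {
            throw new IllegalArgumentException("Expected host:port but got '" + hostAndPort + "'");
        }
        String host = hostAndPort.substring(0, idx);
        int port = Integer.parseInt(hostAndPort.substring(idx + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

In the docker-compose setup used later on this page, localhost:9052 points to the seed node of DC1 and localhost:9062 to the seed node of DC2.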

📘 DriverConfigurationFile: load driver configuration from a file

If application.conf is part of your classpath it is loaded automatically, but you may want to provide your own external file.

config.withCqlDriverConfigurationFile(null)

📘 DriverConfigurationLoader: load configuration for driver programmatically

Allow configuration through programmatic configuration loader

config.withCqlDriverConfigurationLoader(null)

📘 CqlDriverOption: Driver Cql Option for local datacenter

Each property in application.conf is identified by a key. Sometimes you want those keys to be set programmatically. Here the OptionsMap will be updated with this property.

config.withCqlDriverOption(null, null)

📘 CqlDriverOptionDC: Driver Cql Option for target datacenter

Each property in application.conf is identified by a key. Sometimes you want those keys to be set programmatically. Here the OptionsMap will be updated with this property. This key will be defined in an execution profile named <datacenter_name>. Each DC will have its own execution profile.

config.withCqlDriverOptionDC("<datacenter_name>", null, null)

📘 CqlKeyspace: working Keyspace

Set current keyspace in the cqlSession.

config.withCqlKeyspace(null)

📘 CqlMetricsRegistry: Cql Metrics Registry

Setup metrics (Dropwizard) at Cql driver level.

config.withCqlMetricsRegistry(null)

📘 CqlRequestTracker: Cql drivers request tracker

Setup request tracker at Cql driver level.

config.withCqlRequestTracker(null)

📘 CqlSessionBuilderCustomizer: Cql Session builder customizer

Allow programmatic external configuration before creating the CqlSession.

config.withCqlSessionBuilderCustomizer(null)

📘 HttpRequestConfig: Http request configurations

Fine-tune HTTP requests with a RequestConfig (timeouts, keep-alive, pooling...). Timeouts default to 20 seconds.

config.withHttpRequestConfig(RequestConfig.custom()
   .setCookieSpec(StandardCookieSpec.STRICT)
   .setExpectContinueEnabled(true)
   .setConnectionRequestTimeout(Timeout.ofSeconds(5))
   .setConnectTimeout(Timeout.ofSeconds(5))
   .setTargetPreferredAuthSchemes(Arrays.asList(StandardAuthScheme.NTLM, StandardAuthScheme.DIGEST))
   .build())

📘 HttpRetryConfig: Http retry configurations

Fine-tune HTTP retries. By default a call is retried 3 times before the resource is considered unavailable by the internal load balancer. The retries use exponential backoff: after 100 millis, 200 millis and 400 millis.

config.withHttpRetryConfig(new RetryConfigBuilder()
   .retryOnAnyException()
   .withDelayBetweenTries(Duration.ofMillis(100))
   .withExponentialBackoff()
   .withMaxNumberOfTries(3)
   .build())
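
To make the backoff arithmetic concrete, here is a JDK-only sketch that doubles the delay after each failed try. It only illustrates the technique: BackoffRetry and its methods are hypothetical names, and the retry library configured above may schedule its delays differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical illustration of retries with exponential backoff (not the SDK's code). */
final class BackoffRetry {

    /** Delay associated with each try: initial, 2 x initial, 4 x initial... */
    static List<Long> delaysMillis(long initialMillis, int maxTries) {
        List<Long> delays = new ArrayList<>();
        long delay = initialMillis;
        for (int i = 0; i < maxTries; i++) {
            delays.add(delay);
            delay *= 2;
        }
        return delays;
    }

    /** Runs the action up to maxTries times, sleeping after each failed try except the last one. */
    static <T> T call(Callable<T> action, long initialMillis, int maxTries) throws Exception {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be >= 1");
        }
        List<Long> delays = delaysMillis(initialMillis, maxTries);
        Exception last = null;
        for (int attempt = 0; attempt < maxTries; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxTries - 1) {
                    Thread.sleep(delays.get(attempt));
                }
            }
        }
        throw last; // every try failed: surface the last error to the caller
    }
}
```

With an initial delay of 100 millis and 3 tries, delaysMillis yields 100, 200 and 400, the sequence quoted above.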

📘 HttpObservers: Monitoring Observers

Define your observer and get notified for each request and retry. The class should implement ApiInvocationObserver.

public class SampleHttpObserver implements ApiInvocationObserver {

    /** {@inheritDoc} */
    @Override
    public void onCall(ApiInvocationEvent event) {
        System.out.println(event.getHost());
    }
    /** {@inheritDoc} */
    @Override
    public void onHttpSuccess(Status<String> s) {}

    /** {@inheritDoc} */
    @Override
    public void onHttpCompletion(Status<String> s) {}

    /** {@inheritDoc} */
    @Override
    public void onHttpFailure(Status<String> s) {}

    /** {@inheritDoc} */
    @Override
    public void onHttpFailedTry(Status<String> s) {}
}

You can register an observer by its name

config.addHttpObserver("sample", new SampleHttpObserver());

You can register all observers at once by providing a Map

Map<String, ApiInvocationObserver> observers = new HashMap<>();
observers.put("sample", new SampleHttpObserver());
observers.put("log", new AnsiLoggerObserver());
config.withHttpObservers(observers);

Two observers are provided out of the box: AnsiLoggerObserver and AnsiLoggerObserverLight.

📘 withoutCqlSession: Disable Cql Session

To enforce a stateless client with no usage of the cqlSession you can disable CQL with this flag

config.withoutCqlSession()

4. Sample Code StargateClient

ℹ️ Each line of the following code (and more) is explained in chapter 3, Configuration StargateClient.

public StargateClient setupStargate() {
 return StargateClient.builder()
  .withApplicationName("FullSample")
    
  // Setup DC1
  .withLocalDatacenter("DC1")
  .withAuthCredentials("cassandra", "cassandra")
  .withCqlContactPoints("localhost:9052")
  .withCqlKeyspace("system")
  .withCqlConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM)
  .withCqlDriverOption(TypedDriverOption.CONNECTION_CONNECT_TIMEOUT, Duration.ofSeconds(10))
  .withCqlDriverOption(TypedDriverOption.CONNECTION_INIT_QUERY_TIMEOUT, Duration.ofSeconds(10))
  .withCqlDriverOption(TypedDriverOption.CONNECTION_SET_KEYSPACE_TIMEOUT, Duration.ofSeconds(10))
  .withCqlDriverOption(TypedDriverOption.CONTROL_CONNECTION_TIMEOUT, Duration.ofSeconds(10))
  .withApiNode(new StargateNodeConfig("dc1s1", "localhost", 8081, 8082, 8080, 8083))
  .withApiNode(new StargateNodeConfig("dc1s2", "localhost", 9091, 9092, 9090, 9093))
                
  // Setup DC2
  .withApiNodeDC("DC2", new StargateNodeConfig("dc2s1", "localhost", 6061, 6062, 6060, 6063))
  .withApiNodeDC("DC2", new StargateNodeConfig("dc2s2", "localhost", 7071, 7072, 7070, 7073))
  .withCqlContactPointsDC("DC2", "localhost:9062")
  .withCqlDriverOptionDC("DC2",TypedDriverOption.CONNECTION_CONNECT_TIMEOUT, Duration.ofSeconds(10))
  .withCqlDriverOptionDC("DC2",TypedDriverOption.CONNECTION_INIT_QUERY_TIMEOUT, Duration.ofSeconds(10))
  .withCqlDriverOptionDC("DC2",TypedDriverOption.CONNECTION_SET_KEYSPACE_TIMEOUT, Duration.ofSeconds(10))
  .withCqlDriverOptionDC("DC2",TypedDriverOption.CONTROL_CONNECTION_TIMEOUT, Duration.ofSeconds(10))
                
  // Setup HTTP
  .withHttpRequestConfig(RequestConfig.custom()
    .setCookieSpec(StandardCookieSpec.STRICT)
    .setExpectContinueEnabled(true)
    .setConnectionRequestTimeout(Timeout.ofSeconds(5))
    .setConnectTimeout(Timeout.ofSeconds(5))
    .setTargetPreferredAuthSchemes(Arrays.asList(StandardAuthScheme.NTLM, StandardAuthScheme.DIGEST))
    .build())
  .withHttpRetryConfig(new RetryConfigBuilder()
    .retryOnAnyException()
    .withDelayBetweenTries( Duration.ofMillis(100))
    .withExponentialBackoff()
    .withMaxNumberOfTries(10)
    .build())
  .addHttpObserver("logger_full", new AnsiLoggerObserver())
  .addHttpObserver("logger_light", new AnsiLoggerObserverLight())
  .build();
}

5. Load-Balancing and Failover

📘 Recommended Architectures

  • It is recommended to delegate load-balancing and failover across Stargate nodes to infrastructure components. When Stargate is deployed in K8ssandra or AstraDB, load-balancing is done for you through Kubernetes Services (ingress).

  • It is recommended to perform failover across regions/datacenters at the application level with DNS-based routing. Indeed, if the failover is implemented at the data layer we can see some spikes in latency (from 25 ms to 150 ms in our tests on Astra).

📘 Client-Side Failover and Load-Balancing

Still, failover and load-balancing can be implemented on the client side within the SDK, and this section describes how to do it. We will use the following architecture.

ℹ️ For clarity, not all gossiping links are represented on the schema. Every node has a CQL endpoint, but only dc1n1 and dc2n1 have been mapped to the host. Endpoint colors represent the format of the interfaces.

StargateClient will perform:

  1. Retry the HTTP request(s) a configurable number of times when a failure is detected
  2. Distribute the requests among the Stargate nodes of the current datacenter. The default load-balancing policy is round-robin but it can be changed (weight, random).
  3. Fail over across Stargate nodes of the same datacenter
  4. Detect Stargate nodes coming back online in the same datacenter
  5. Fail over across datacenters if all nodes of the current datacenter are down
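
The node selection described in this list can be sketched with plain JDK collections. The class below is a simplified illustration, not the SDK's load balancer: DcAwareRoundRobin and its methods are hypothetical names, nodes are marked up or down by the caller, and there is no automatic fail-back to the original datacenter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical round-robin balancer with in-DC and cross-DC failover (not the SDK's code). */
final class DcAwareRoundRobin {
    private final Map<String, List<String>> nodesByDc = new LinkedHashMap<>();
    private final List<String> down = new ArrayList<>();
    private String currentDc;
    private int cursor = 0;

    DcAwareRoundRobin(String localDc) {
        this.currentDc = localDc;
    }

    void addNode(String dc, String node) {
        nodesByDc.computeIfAbsent(dc, k -> new ArrayList<>()).add(node);
    }

    void markDown(String node) {
        if (!down.contains(node)) down.add(node);
    }

    void markUp(String node) {
        down.remove(node);
    }

    String currentDc() {
        return currentDc;
    }

    /** Next available node: round-robin in the current DC, otherwise fail over to another DC. */
    String next() {
        String node = nextIn(currentDc);
        if (node != null) return node;
        for (String dc : nodesByDc.keySet()) {
            node = nextIn(dc);
            if (node != null) {
                currentDc = dc; // stay in the new DC for the following requests
                return node;
            }
        }
        throw new IllegalStateException("No Stargate node available in any datacenter");
    }

    /** Tries each node of the DC once, starting at the cursor, and skips nodes marked down. */
    private String nextIn(String dc) {
        List<String> nodes = nodesByDc.getOrDefault(dc, new ArrayList<>());
        for (int i = 0; i < nodes.size(); i++) {
            String candidate = nodes.get(Math.floorMod(cursor++, nodes.size()));
            if (!down.contains(candidate)) return candidate;
        }
        return null;
    }
}
```

With dc1s1 and dc1s2 registered in DC1, calls alternate between the two nodes. After markDown("dc1s2") every call returns dc1s1, and once both are down the next call returns a node of another registered datacenter.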

In the following walkthrough we will illustrate how load-balancing and failover work.

5a. Start DC1

⚠️ Prerequisites: Docker, Java, Maven and an IDE installed as detailed in Quickstart + Allocate 12 GB of memory to Docker (we will launch 10 containers)

✅ Start nodes in DC1: dc1n1, dc1n2, dc1n3

The seed node dc1n1 starts first; dc1n2 waits 30s and dc1n3 waits 80s so that the nodes bootstrap one after the other.

docker-compose up -d dc1n1 dc1n2 dc1n3

🖥️ Expected output

Creating sdk-stargate-full_dc1n1_1 ... done
Creating sdk-stargate-full_dc1n2_1 ... done
Creating sdk-stargate-full_dc1n3_1 ... done

✅ Control your containers

All containers will start right away but the nodes will follow the bootstrapping sequence.

docker-compose ps

🖥️ Expected output

          Name                         Command               State                                        Ports                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
sdk-stargate-full_dc1n1_1   docker-entrypoint.sh cassa ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 0.0.0.0:9052->9042/tcp,:::9052->9042/tcp, 9160/tcp
sdk-stargate-full_dc1n2_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                
sdk-stargate-full_dc1n3_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp     

✅ Control your Cassandra Nodes

export dc1n1=`docker ps | grep dc1n1 | cut -b 1-12`
docker exec -it $dc1n1 nodetool status

🖥️ Expected output: Nodes are up (U) and normal (N), datacenter is DC1.

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
UN  172.21.0.4  177.81 KiB  256          34.8%             976ad0fb-2d55-4b2c-b5d8-17a9e8ddc7f3  rack1
UN  172.21.0.3  187.74 KiB  256          33.0%             ae778688-b146-43db-a1e1-44a689eea042  rack1
UN  172.21.0.2  161.77 KiB  256          30.1%             d1d702ca-efd7-4f09-a192-4c5283cd20d1  rack1

✅ Alter keyspace system_auth

export dc1n1_ip=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $dc1n1)
docker exec -it $dc1n1 \
 cqlsh $dc1n1_ip \
   -u cassandra \
   -p cassandra \
   -e "ALTER KEYSPACE system_auth WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'DC1' : 3};"

✅ Create keyspace data_endpoint_auth

ℹ️ This keyspace is used by Stargate to store the authentication tokens.

docker exec -it $dc1n1 \
 cqlsh $dc1n1_ip \
   -u cassandra \
   -p cassandra \
   -e "DROP KEYSPACE IF EXISTS data_endpoint_auth;CREATE KEYSPACE data_endpoint_auth WITH replication = {'class':'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true;DESCRIBE KEYSPACE data_endpoint_auth;"

5b. Start DC2 (optional)

✅ Start the seed and 2 other nodes: dc2n1, dc2n2, dc2n3

docker-compose up -d dc2n1 dc2n2 dc2n3

🖥️ Expected output

Creating sdk-stargate-full_dc2n1_1 ... done
Creating sdk-stargate-full_dc2n2_1 ... done
Creating sdk-stargate-full_dc2n3_1 ... done

✅ Control your containers

docker-compose ps

🖥️ Expected output

          Name                         Command               State                                        Ports                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
sdk-stargate-full_dc1n1_1   docker-entrypoint.sh cassa ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 0.0.0.0:9052->9042/tcp,:::9052->9042/tcp, 9160/tcp
sdk-stargate-full_dc1n2_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                
sdk-stargate-full_dc1n3_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                
sdk-stargate-full_dc2n1_1   docker-entrypoint.sh cassa ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 0.0.0.0:9062->9042/tcp,:::9062->9042/tcp, 9160/tcp
sdk-stargate-full_dc2n2_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                
sdk-stargate-full_dc2n3_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp    

✅ Control your Cassandra Nodes

docker exec -it $dc1n1 nodetool status

🖥️ Expected output: nodes are up (U) and normal (N)

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
UN  172.21.0.4  177.81 KiB  256          34.8%             976ad0fb-2d55-4b2c-b5d8-17a9e8ddc7f3  rack1
UN  172.21.0.3  187.74 KiB  256          33.0%             ae778688-b146-43db-a1e1-44a689eea042  rack1
UN  172.21.0.2  161.77 KiB  256          30.1%             d1d702ca-efd7-4f09-a192-4c5283cd20d1  rack1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
UN  172.21.0.5  157.31 KiB  256          33.7%             4957b5ff-4cac-4d90-b4f1-dfb1404c4908  rack1
UN  172.21.0.7  183.05 KiB  256          34.1%             f942486f-3cea-4c69-aa10-46eb430012de  rack1
UN  172.21.0.6  182.55 KiB  256          34.3%             2856108c-ac76-404e-ad6d-8818810c5b12  rack1

✅ Alter keyspace system_auth

export dc2n1=`docker ps | grep dc2n1 | cut -b 1-12`
export dc2n1_ip=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $dc2n1)

docker exec -it $dc2n1 \
 cqlsh $dc2n1_ip \
   -u cassandra \
   -p cassandra \
   -e "ALTER KEYSPACE system_auth WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3};"

✅ Alter keyspace data_endpoint_auth

docker exec -it $dc2n1 \
 cqlsh $dc2n1_ip \
   -u cassandra \
   -p cassandra \
   -e "ALTER KEYSPACE data_endpoint_auth WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3};"
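
Both ALTER KEYSPACE statements above follow the same pattern: one replication entry per datacenter. As a small sketch, the statement can be generated from a map of datacenter names to replication factors. The class ReplicationCql is a hypothetical helper, not part of the SDK or of the demo.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of the SDK): builds the ALTER KEYSPACE statement used above. */
final class ReplicationCql {

    /** One 'DC' : replicas entry per datacenter, in the iteration order of the map. */
    static String alterKeyspace(String keyspace, Map<String, Integer> replicasByDc) {
        StringJoiner dcs = new StringJoiner(", ");
        for (Map.Entry<String, Integer> entry : replicasByDc.entrySet()) {
            dcs.add("'" + entry.getKey() + "' : " + entry.getValue());
        }
        return "ALTER KEYSPACE " + keyspace
            + " WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', " + dcs + "};";
    }
}
```

Passing a LinkedHashMap keeps the datacenters in insertion order, so DC1 is listed before DC2 as in the commands above.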

5c. Start Stargate Nodes

✅ Start nodes for dc1

docker-compose up -d dc1s1 dc1s2

✅ Start nodes for dc2

docker-compose up -d dc2s1 dc2s2

✅ Control Cluster

docker exec -it $dc2n1 \
 cqlsh $dc2n1_ip \
   -u cassandra \
   -p cassandra \
   -e "SELECT data_center from system.local"

🖥️ Expected output

 data_center
-------------
         DC2
(1 rows)

docker-compose ps

🖥️ Expected output

          Name                         Command               State                                                      Ports                                                   
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sdk-stargate-full_dc1n1_1   docker-entrypoint.sh cassa ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 0.0.0.0:9052->9042/tcp,:::9052->9042/tcp, 9160/tcp                           
sdk-stargate-full_dc1n2_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                                           
sdk-stargate-full_dc1n3_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                                           
sdk-stargate-full_dc1s1_1   ./starctl                        Up      0.0.0.0:8080->8080/tcp,:::8080->8080/tcp, 0.0.0.0:8081->8081/tcp,:::8081->8081/tcp,                        
                                                                     0.0.0.0:8082->8082/tcp,:::8082->8082/tcp, 0.0.0.0:8083->8090/tcp,:::8083->8090/tcp,                        
                                                                     0.0.0.0:9053->9042/tcp,:::9053->9042/tcp                                                                   
sdk-stargate-full_dc1s2_1   ./starctl                        Up      0.0.0.0:9090->8080/tcp,:::9090->8080/tcp, 0.0.0.0:9091->8081/tcp,:::9091->8081/tcp,                        
                                                                     0.0.0.0:9092->8082/tcp,:::9092->8082/tcp, 0.0.0.0:9093->8090/tcp,:::9093->8090/tcp,                        
                                                                     0.0.0.0:9054->9042/tcp,:::9054->9042/tcp                                                                   
sdk-stargate-full_dc2n1_1   docker-entrypoint.sh cassa ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 0.0.0.0:9062->9042/tcp,:::9062->9042/tcp, 9160/tcp                           
sdk-stargate-full_dc2n2_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                                           
sdk-stargate-full_dc2n3_1   docker-entrypoint.sh /bin/ ...   Up      7000/tcp, 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp                                                           
sdk-stargate-full_dc2s1_1   ./starctl                        Up      0.0.0.0:6060->8080/tcp,:::6060->8080/tcp, 0.0.0.0:6061->8081/tcp,:::6061->8081/tcp,                        
                                                                     0.0.0.0:6062->8082/tcp,:::6062->8082/tcp, 0.0.0.0:6063->8090/tcp,:::6063->8090/tcp,                        
                                                                     0.0.0.0:9063->9042/tcp,:::9063->9042/tcp                                                                   
sdk-stargate-full_dc2s2_1   ./starctl                        Up      0.0.0.0:7070->8080/tcp,:::7070->8080/tcp, 0.0.0.0:7071->8081/tcp,:::7071->8081/tcp,                        
                                                                     0.0.0.0:7072->8082/tcp,:::7072->8082/tcp, 0.0.0.0:7073->8090/tcp,:::7073->8090/tcp,                        
                                                                     0.0.0.0:9064->9042/tcp,:::9064->9042/tcp 

5d. Configuration

Let us write some code that loops forever, interacting with each API, to see how the client behaves on errors.
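
The demo code itself is not reproduced on this page. The JDK-only sketch below only shows the shape of such a loop: it calls a list of URLs in turn and prints one status line per call, in the same format as the logs of section 5e. ApiLoopSketch is a hypothetical name, the HTTP call is injected as a function, and the loop is bounded so that it can be tried safely; the real demo goes through StargateClient.

```java
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical polling loop (not the demo's code): prints one status line per call. */
final class ApiLoopSketch {

    /** Calls every URL once per iteration and returns the number of 5xx responses observed. */
    static int run(List<String> urls, ToIntFunction<String> httpGet, int iterations, long pauseMillis) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            for (String url : urls) {
                int code = httpGet.applyAsInt(url); // assumed to return the HTTP status code
                System.out.println("Http (" + code + ") on " + url);
                if (code >= 500) failures++;
            }
            System.out.println("---");
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop looping when interrupted
                break;
            }
        }
        return failures;
    }
}
```

Injecting the HTTP call keeps the loop independent from any client library; a very large iteration count gives the "loop forever" behavior used in the experiments.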

5e. Experimentations

✅ Start

Expecting load-balancing across dc1s1 and dc1s2 (ports 8080, 8082 and 9090, 9092)

DataCenter Name (cql) : DC1
Http (201) on http://localhost:9091/v1/auth
Http (200) on http://localhost:8082/v2/schemas/keyspaces
Http (200) on http://localhost:8082/v2/schemas/namespaces
---
DataCenter Name (cql) : DC1
Http (200) on http://localhost:9090/graphql-schema
Http (200) on http://localhost:8082/v2/schemas/keyspaces
Http (200) on http://localhost:9092/v2/schemas/namespaces
---
DataCenter Name (cql) : DC1
Http (200) on http://localhost:8080/graphql-schema
Http (200) on http://localhost:9092/v2/schemas/keyspaces
Http (200) on http://localhost:8082/v2/schemas/namespaces
---

✅ Kill stargate node dc1s2

docker-compose stop dc1s2

The failure is detected and the load is rebalanced; dc1s1 is expected to be still up and running (8080)

---
Http (200) on http://localhost:8080/graphql-schema
DataCenter Name (cql) : DC1
org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec : Recoverable I/O exception (org.apache.hc.core5.http.NoHttpResponseException) caught when processing request to {}->http://localhost:9092
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 1/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 2/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 3/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Calls failed after 3 retries
com.datastax.stargate.sdk.utils.HttpApisClient : Error for request [a9fa5eea-9a07-4846-9c94-192b47bceddb], url=http://localhost:9092/v2/schemas/keyspaces, method=GET, code=503, body=Response is empty, cannot contact endpoint, please check url
Http (503) on http://localhost:9092/v2/schemas/keyspaces
com.datastax.stargate.sdk.StargateClient : A stargate node is down [dc1s2], falling back to another node...
Disabling... dc1s2
com.datastax.stargate.sdk.loadbalancer.Loadbalancer: Resources status after weight computation:
com.datastax.stargate.sdk.loadbalancer.Loadbalancer:  + dc1s2: 0.0
com.datastax.stargate.sdk.loadbalancer.Loadbalancer:  + dc1s1: 100.0
Http (200) on http://localhost:8082/v2/schemas/keyspaces
Http (200) on http://localhost:8082/v2/schemas/namespaces
---
Http (200) on http://localhost:8080/graphql-schema
DataCenter Name (cql) : DC1
Http (200) on http://localhost:8082/v2/schemas/keyspaces
Http (200) on http://localhost:8082/v2/schemas/namespaces
---
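
The "Resources status after weight computation" lines report a weight per node: 0.0 for the disabled node and 100.0 for the remaining one. One simple rule that reproduces these numbers is to split 100 evenly across the available nodes. This rule is an assumption made for illustration (the SDK's computation may differ), and WeightSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical weight computation (not the SDK's code). */
final class WeightSketch {

    /** Splits 100.0 evenly across available nodes; unavailable nodes get 0.0. */
    static Map<String, Double> computeWeights(Map<String, Boolean> availability) {
        long up = availability.values().stream().filter(Boolean::booleanValue).count();
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> entry : availability.entrySet()) {
            weights.put(entry.getKey(), entry.getValue() ? 100.0 / up : 0.0);
        }
        return weights;
    }
}
```

With dc1s2 unavailable and dc1s1 available, this gives 0.0 and 100.0; with both available, each node gets 50.0.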

✅ Restart the stargate node in DC1

docker-compose up -d

Expect dc1s2 to be detected again and the load to be balanced across dc1s1 and dc1s2. Check the logs, as dc1s2 can take a while to come up: in between you can get 404 (node is not yet detected) and then 503 (node is starting). In the log below the call succeeds on attempt #3; before that, the client tried every 10s and failed.

[..] Using only node dc1s1
Http (200) on http://localhost:8080/graphql-schema
DataCenter Name (cql) : DC1
com.datastax.stargate.sdk.loadbalancer.Loadbalancer: dc1s2 has reached ends of its unavailability period, putting it back in the pool
com.datastax.stargate.sdk.loadbalancer.Loadbalancer : Resources status after weight computation:
com.datastax.stargate.sdk.loadbalancer.Loadbalancer :  + dc1s2: 50.0
com.datastax.stargate.sdk.loadbalancer.Loadbalancer:  + dc1s1: 50.0
org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec : Recoverable I/O exception (org.apache.hc.core5.http.NoHttpResponseException) caught when processing request to {}->http://localhost:9092
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 1/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failed request GET on http://localhost:9092/v2/schemas/keyspaces/v2/schemas/keyspaces
org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec : Recoverable I/O exception (org.apache.hc.core5.http.NoHttpResponseException) caught when processing request to {}->http://localhost:9092
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 2/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failed request GET on http://localhost:9092/v2/schemas/keyspaces/v2/schemas/keyspaces
Http (200) on http://localhost:9092/v2/schemas/keyspaces
Http (200) on http://localhost:9092/v2/schemas/namespaces
---
Http (200) on http://localhost:8080/graphql-schema
DataCenter Name (cql) : DC1
Http (200) on http://localhost:9092/v2/schemas/keyspaces
Http (200) on http://localhost:8082/v2/schemas/namespaces
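
The log line "dc1s2 has reached ends of its unavailability period, putting it back in the pool" suggests that a failed node is set aside for a fixed period and then retried. A minimal sketch of that idea follows. UnavailabilityTracker is a hypothetical name, and the period is a constructor argument because the SDK's actual default is not stated here (the text above mentions retries every 10s).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker (not the SDK's code): sets failed nodes aside for a fixed period. */
final class UnavailabilityTracker {
    private final Duration period;
    private final Map<String, Instant> downUntil = new HashMap<>();

    UnavailabilityTracker(Duration period) {
        this.period = period;
    }

    /** Records a failure: the node is skipped until now + period. */
    void markDown(String node, Instant now) {
        downUntil.put(node, now.plus(period));
    }

    /** A node is usable if it was never marked down or if its unavailability period is over. */
    boolean isAvailable(String node, Instant now) {
        Instant until = downUntil.get(node);
        if (until == null) return true;
        if (now.isBefore(until)) return false;
        downUntil.remove(node); // period over: put the node back in the pool
        return true;
    }
}
```

A node put back in the pool this way can still fail its next request, as in the log above where the first attempts on port 9092 fail before a call succeeds; it would then be marked down again.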

✅ Kill both Stargate nodes in DC1

As before, if one node is down the client falls back to the second node. When both are unavailable, it fails over to the other datacenter.

DataCenter Name (cql) : DC1
Http (200) on http://localhost:9090/graphql-schema
Http (200) on http://localhost:9092/v2/schemas/keyspaces
Http (200) on http://localhost:9092/v2/schemas/namespaces
---
Http (200) on http://localhost:9090/graphql-schema
DataCenter Name (cql) : DC1
org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec: Recoverable I/O exception (org.apache.hc.core5.http.NoHttpResponseException) caught when processing request to {}->http://localhost:9092
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 1/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failed request GET on http://localhost:9092/v2/schemas/keyspaces/v2/schemas/keyspaces
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 2/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failed request GET on http://localhost:9092/v2/schemas/keyspaces/v2/schemas/keyspaces
com.datastax.stargate.sdk.utils.HttpApisClient : Failure on attempt 3/3 
com.datastax.stargate.sdk.utils.HttpApisClient : Failed request GET on http://localhost:9092/v2/schemas/keyspaces/v2/schemas/keyspaces
com.datastax.stargate.sdk.utils.HttpApisClient : Calls failed after 3 retries
com.datastax.stargate.sdk.utils.HttpApisClient : Error for request [83e5afe0-6330-4502-8eb7-03b70bb3a0c1], url=http://localhost:9092/v2/schemas/keyspaces, method=GET, code=503, body=Response is empty, cannot contact endpoint, please check url
com.datastax.stargate.sdk.StargateClient : A stargate node is down [dc1s2], falling back to another node...
Http (503) on http://localhost:9092/v2/schemas/keyspaces
com.datastax.stargate.sdk.loadbalancer.Loadbalancer : Resources status after weight computation:
com.datastax.stargate.sdk.loadbalancer.Loadbalancer :  + dc1s1: 0.0
com.datastax.stargate.sdk.loadbalancer.Loadbalancer :  + dc1s2: 0.0
com.datastax.stargate.sdk.StargateClient : No node available is localDc [DC1], falling back to another DC if available ...
com.datastax.stargate.sdk.StargateClient : Using DataCenter [DC2]
com.datastax.stargate.sdk.StargateClient : + CqlSession   :[ ENABLED] with keyspace [system] and dc [DC2 ]
com.datastax.stargate.sdk.StargateClient : Failover from DC1 to DC2
Http (201) on http://localhost:7071/v1/auth
Http (200) on http://localhost:6062/v2/schemas/keyspaces
Http (200) on http://localhost:6062/v2/schemas/namespaces
---
Http (200) on http://localhost:7070/graphql-schema
DataCenter Name (cql) : DC2
Http (200) on http://localhost:6062/v2/schemas/keyspaces
Http (200) on http://localhost:7072/v2/schemas/namespaces

ℹ️ Code of this demonstration is available for download 📥 here.