Document API

Cedrick Lunven edited this page Sep 9, 2021 · 33 revisions

Astra and Stargate let Apache Cassandra store documents like a document-oriented NoSQL database. To cope with the constraints of the Cassandra data model, documents are stored using a document shredding function.
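
To give a feel for what shredding means, here is a minimal sketch that flattens a nested document into one row per leaf value, keyed by its path. It is an illustration only: the class ShreddingSketch and the dotted path format are assumptions made for this example, and the storage layout Stargate actually uses is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of document shredding; not Stargate's actual implementation.
class ShreddingSketch {

    // Flattens a nested document into (path -> leaf value) rows.
    static Map<String, Object> shred(Map<String, Object> document) {
        Map<String, Object> rows = new LinkedHashMap<>();
        shred("", document, rows);
        return rows;
    }

    private static void shred(String prefix, Object node, Map<String, Object> rows) {
        if (node instanceof Map) {
            // Nested object: recurse into each attribute
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                shred(join(prefix, String.valueOf(e.getKey())), e.getValue(), rows);
            }
        } else if (node instanceof List) {
            // Array: recurse into each element, using its index as path segment
            List<?> list = (List<?>) node;
            for (int i = 0; i < list.size(); i++) {
                shred(join(prefix, "[" + i + "]"), list.get(i), rows);
            }
        } else {
            // Leaf value: emit one row
            rows.put(prefix, node);
        }
    }

    private static String join(String prefix, String segment) {
        return prefix.isEmpty() ? segment : prefix + "." + segment;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zipCode", 75000);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("firstname", "John");
        person.put("countries", List.of("France", "Spain"));
        person.put("address", address);
        shred(person).forEach((path, value) -> System.out.println(path + " = " + value));
    }
}
```

Each leaf ends up as an independent row, which is what makes it possible to read or update a single attribute without rewriting the whole document.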

As a Java developer, you want to work with objects (entities) and let the SDK interact with the API to perform the operations you need: create, read, update, delete and search.

Initialisation

AstraClient and StargateClient initializations are detailed on the Home page. From here on, the sample code reuses those classes but does not initialize them.

Class ApiDocumentClient is the core class for working with documents. There are multiple ways to retrieve or initialize it.

// Option 1. Given an AstraClient
ApiDocumentClient client1 = astraClient.apiStargateDocument();
ApiDocumentClient client2 = astraClient.getStargateClient().apiDocument();

// Option 2. Given a StargateClient
ApiDocumentClient client3 = stargateClient.apiDocument();

// Option 3. Constructors
ApiDocumentClient client4_Astra    = new ApiDocumentClient("http://api_endpoint", "apiToken");
ApiDocumentClient client5_Stargate = new ApiDocumentClient("http://api_endpoint", 
  new TokenProviderDefault("username", "password", "http://auth_endpoint"));

From now on, the other samples use the variable apiDocClient as the working instance of ApiDocumentClient.

Working with namespaces

This class is the main unit test for this API and can be used as reference code.

✅ List available namespace names

Stream<String> namespaces = apiDocClient.namespaceNames();

✅ Lists available Namespaces

Related endpoint documentation can be found here

Stream<Namespace> namespaces = apiDocClient.namespaces();

✅ Find namespace by its id

Related endpoint documentation can be found here

Optional<Namespace> ns1 = apiDocClient.namespace("ns1").find();

✅ Test if namespace exists

apiDocClient.namespace("ns1").exist();

✅ Create a new namespace

🚨 As of today, namespace and keyspace creation in Astra is only available through the DevOps API.

// Create a namespace with a single DC dc-1
DataCenter dc1 = new DataCenter("dc-1", 1);
apiDocClient.namespace("ns1").create(dc1);

// Create a namespace providing only the replication factor
apiDocClient.namespace("ns1").createSimple(3);

✅ Delete a namespace

🚨 As of today the namespace / keyspace creations are not available in ASTRA

apiDocClient.namespace("ns1").delete();

ℹ️ Tips

You can simplify the code by assigning apiDocClient.namespace("ns1") to a NamespaceClient variable as shown below:

NamespaceClient ns1Client = astraClient.apiStargateDocument().namespace("ns1");
        
// Create if not exist
if (!ns1Client.exist()) {
  ns1Client.createSimple(3);
}
        
// Show datacenters where it lives
ns1Client.find().get().getDatacenters()
         .stream().map(DataCenter::getName)
         .forEach(System.out::println); 
        
// Delete 
ns1Client.delete();

Working with Collections

The related Api Documentation is available here

✅ List available collections in a namespace

// We can create a local variable to shorten the code.
NamespaceClient ns1Client = apiDocClient.namespace("ns1");
Stream<String> colNames   = ns1Client.collectionNames();

✅ Check if collection exists

CollectionClient col1Client = apiDocClient.namespace("ns1").collection("col1");
boolean colExist = col1Client.exist();

✅ Retrieve a collection from its name

Optional<CollectionDefinition> col1Def = apiDocClient.namespace("ns1").collection("col1").find();

✅ Create an empty collection

apiDocClient.namespace("ns1").collection("col1").create();

✅ Delete a collection

apiDocClient.namespace("ns1").collection("col1").delete();

ℹ️ Tips

You can simplify the code by assigning apiDocClient.namespace("ns1").collection("col1") to a CollectionClient variable:

CollectionClient colClient = apiDocClient.namespace("ns1").collection("col1");
colClient.exist();
//...

Working with Documents

📘 About ApiDocument

Class ApiDocument is a wrapper that returns your objects together with their identifiers. It is used in search results.

public class ApiDocument<BEAN> {
    private final String documentId;
    private final BEAN document;
}

To simplify the code in the following samples, we declare a CollectionClient variable as follows:

CollectionClient colPersonClient = apiDocClient.namespace("ns1").collection("col1");

📘 About Object Mapping

In the following samples you will notice that the SDK marshalls and unmarshalls beans so that you work with Java objects. No annotation is required: serialization relies on Jackson and happens automatically.

Here are the class definitions for the beans used in the samples.

public class Person {
  private String firstname;
  private String lastname;
  private int age;
  private List<String> countries;
  private Address address;
  //constructor..getters..setters
}
public class Address {
  private String city;
  private int zipCode;
}

✅ Get a document by id

// doc1 is the document Id in the collection
boolean docExist = colPersonClient.document("doc1").exist();

// Find returns an optional
Optional<Person> p = colPersonClient.document("doc1").find(Person.class);

✅ Create a new document with no id

// Define an object
Person john = new Person("John", "Doe", 20, new Address("Paris", 75000));

// As no id has been provided, the API will create a UUID and return it to you
String docId = colPersonClient.createNewDocument(john);

✅ Upsert a document enforcing the id

// Define an object
Person john2 = new Person("John", "Doe", 20, new Address("Paris", 75000));

// Now the id is provided (myId) and we can upsert
String docId = colPersonClient.document("myId").upsert(john2, Person.class);

✅ Delete a Document from its id

colPersonClient.document("myId").delete();

✅ Count Documents

🚨 This operation can be slow. Every query to the API is paged: the method fetches pages one after another (keeping the payload size as small as possible) for as long as there are pages left, then counts the results.

int docNum = colPersonClient.count();
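
The cost of counting is easier to see with a small sketch of page exhaustion. Everything below is hypothetical (the Page class, the in-memory "API" and the offset-based page state are stand-ins invented for this example, not SDK types); it only shows that a count over a paged API needs one call per page.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of page exhaustion; the names below are not part of the SDK.
class PagedCountSketch {

    // Simplified stand-in for a result page: some results plus an optional cursor.
    static final class Page {
        final List<String> results;
        final Optional<String> pageState;

        Page(List<String> results, Optional<String> pageState) {
            this.results = results;
            this.pageState = pageState;
        }
    }

    // Fake "API" serving fixed-size pages out of an in-memory list.
    // The page state is simply the offset of the next page, encoded as a string.
    static Function<Optional<String>, Page> inMemoryApi(List<String> docs, int pageSize) {
        return state -> {
            int from = state.map(Integer::parseInt).orElse(0);
            int to = Math.min(from + pageSize, docs.size());
            Optional<String> next = to < docs.size()
                    ? Optional.of(String.valueOf(to))
                    : Optional.empty();
            return new Page(docs.subList(from, to), next);
        };
    }

    // Counts documents by walking every page: one call per page.
    static int count(Function<Optional<String>, Page> api) {
        int total = 0;
        Optional<String> state = Optional.empty();
        do {
            Page page = api.apply(state);
            total += page.results.size();
            state = page.pageState;
        } while (state.isPresent());
        return total;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("d1", "d2", "d3", "d4", "d5");
        // With a page size of 2, counting 5 documents takes 3 calls
        System.out.println(count(inMemoryApi(docs, 2)));
    }
}
```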

✅ Retrieve all Documents

🚨 This operation can be slow. Every query to the API is paged, and this method fetches all pages.

Stream<ApiDocument<Person>> allDocs = colPersonClient.findAll(Person.class);

✅ Retrieve Documents with Paging

The ResultPage class and its specialization DocumentResultPage hold the paged results.

public class ResultPage<R> {
  private final int pageSize;
  private final String pageState;
  private final List<R> results;
}

public class DocumentResultPage<DOC> extends ResultPage<ApiDocument<DOC>> {}

Retrieving pages with no search criteria

// Retrieve the first 10 items (page 1 as no pagingState provided)
DocumentResultPage<Person> page1 = colPersonClient.findPage(Person.class, 10);

// Retrieve the next 10 items (page 2 using pageState returned for page1)
DocumentResultPage<Person> page2 = colPersonClient.findPage(Person.class, 10, page1.getPageState());

✅ Search for Documents

The Document API lets you search on any field of the documents. A where clause is expected. In the REST API the parameter looks like {"age": {"$gte":30},"lastname": {"$eq":"PersonAstra2"}}; in the SDK a dedicated query class with a builder is provided: SearchDocumentQuery.

🚨 IMPORTANT: All queries are paged by default. It is possible to list all documents, or all documents matching a filter, with a single call, but note that under the hood we are still exhausting all pages, so it can be slow.

// Building query {"age": {"$gte":30},"lastname": {"$eq":"PersonAstra2"}} 
SearchDocumentQuery query = SearchDocumentQuery.builder()
                    .where("age").isGreaterOrEqualsThan(30)      // First filter to use where() 
                    .and("lastname").isEqualsTo("PersonAstra2")  // Any extra filter to use and()
                    .withPageSize(10)                            // Default and max pageSize are 20
                    .build();

// Retrieve PAGE 1
DocumentResultPage<Person> currentPage = colPersonClient.findPage(query, Person.class);

// Retrieve PAGE 2 (if any)
if (currentPage.getPageState().isPresent()) {
  query.setPageState(currentPage.getPageState().get());
  DocumentResultPage<Person> nextPage = colPersonClient.findPage(query, Person.class);
}

// Retrieve all documents in one call (slow: see the IMPORTANT warning above)
Stream<ApiDocument<Person>> allPersons = colPersonClient.findAll(query, Person.class);
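
To make the link between the builder and the REST where clause more tangible, here is a minimal sketch that assembles the same JSON filter by hand. The class WhereClauseSketch and its methods are hypothetical and are not how SearchDocumentQuery is implemented; only the $gte and $eq operators shown in the example above are covered, and string escaping is deliberately minimal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: builds a where clause like the one expected by the REST API.
class WhereClauseSketch {

    // field name -> JSON fragment holding the operator and its operand
    private final Map<String, String> filters = new LinkedHashMap<>();

    WhereClauseSketch gte(String field, Number value) {
        filters.put(field, "{\"$gte\":" + value + "}");
        return this;
    }

    WhereClauseSketch eq(String field, String value) {
        // Minimal escaping (backslash and double quote only), enough for this sketch
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        filters.put(field, "{\"$eq\":\"" + escaped + "\"}");
        return this;
    }

    // Joins all filters into a single JSON object
    String toJson() {
        return filters.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        String where = new WhereClauseSketch()
                .gte("age", 30)
                .eq("lastname", "PersonAstra2")
                .toJson();
        System.out.println(where); // {"age":{"$gte":30},"lastname":{"$eq":"PersonAstra2"}}
    }
}
```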

✅ Find part of a document

The Document API lets you work with nested structures in a document. You provide the path in the URL:

http://{doc-api-endpoint}/namespaces/{namespace-id}/collections/{collection-id}/{document-id}/{document-path}

Given a JSON document with UUID e8c5021b-2c91-4015-aec6-14a16e449818:

{ 
  "age": 25,
  "firstname": "PersonAstra5",
  "lastname": "PersonAstra1",
  "address": {
    "city": "Paris",
    "zipCode": 75000
  }
}

You can retrieve the zipCode with: http://{doc-api-endpoint}/namespaces/ns1/collections/person/e8c5021b-2c91-4015-aec6-14a16e449818/address/zipCode

The SDK provides some utility methods to work with this as well:

// Retrieve an object and marshall
Optional<Address> address = colPersonClient
   .document("e8c5021b-2c91-4015-aec6-14a16e449818")
   .findSubDocument("address", Address.class);
        
// Retrieve a scalar deeper in the tree
Optional<Integer> zipcode = colPersonClient
  .document("e8c5021b-2c91-4015-aec6-14a16e449818")
  .findSubDocument("address/zipCode", Integer.class);
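
As a mental model of what a document path such as address/zipCode selects, the sketch below walks a nested map segment by segment. It is a local illustration only: DocumentPathSketch and resolve are hypothetical names, the document is a plain in-memory Map, and no call to the Document API is made.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: resolves a slash-separated path inside a nested document.
class DocumentPathSketch {

    static Optional<Object> resolve(Map<String, Object> document, String path) {
        Object current = document;
        for (String segment : path.split("/")) {
            if (!(current instanceof Map)) {
                return Optional.empty(); // cannot go deeper than a scalar
            }
            current = ((Map<?, ?>) current).get(segment);
            if (current == null) {
                return Optional.empty(); // unknown attribute
            }
        }
        return Optional.of(current);
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zipCode", 75000);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("age", 25);
        person.put("address", address);

        System.out.println(resolve(person, "address/zipCode")); // Optional[75000]
        System.out.println(resolve(person, "address/country")); // Optional.empty
    }
}
```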

✅ Update a sub document

// Update an existing attribute of the JSON
colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .updateSubDocument("address", new Address("city2", 8000));

// Create a new attribute in the document
colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .updateSubDocument("secondAddress", new Address("city2", 8000));

✅ Delete part of a document

colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .deleteSubDocument("secondAddress");

Document Repository

📘 StargateDocumentRepository overview

If you have worked with Spring Data or Active Record before, you may already know what repositories are: classes that provide CRUD (create, read, update, delete) operations without you having to code anything.

It is no different here: if you provide a bean class for a collection, this is what is available to you:

public interface StargateDocumentRepository <DOC> {
   
   // Create
   String insert(DOC p);
   void insert(String docId, DOC doc);
   
   // Read unitary
   boolean exists(String docId);
   Optional<DOC> find(String docId);

   // Read records
   int count();
   DocumentResultPage<DOC> findPage();
   DocumentResultPage<DOC> findPage(SearchDocumentQuery query);
   Stream<ApiDocument<DOC>> findAll();
   Stream<ApiDocument<DOC>> findAll(SearchDocumentQuery query);

   // Update
   void save(String docId, DOC doc);

   // Delete
   void delete(String docId);
}
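
To make this contract concrete, here is a minimal in-memory sketch of a few of these operations. It is a stand-in written for this page (the class InMemoryRepositorySketch is hypothetical and nothing in it calls the Document API); it only illustrates the semantics one would expect: insert without an id generates one, save creates or replaces, and find returns an Optional.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical in-memory stand-in mirroring a subset of the repository operations.
class InMemoryRepositorySketch<DOC> {

    private final Map<String, DOC> store = new LinkedHashMap<>();

    // Create: no id supplied, so one is generated and returned
    String insert(DOC doc) {
        String docId = UUID.randomUUID().toString();
        store.put(docId, doc);
        return docId;
    }

    // Update: creates or replaces the document stored under the given id
    void save(String docId, DOC doc) {
        store.put(docId, doc);
    }

    // Read
    boolean exists(String docId) {
        return store.containsKey(docId);
    }

    Optional<DOC> find(String docId) {
        return Optional.ofNullable(store.get(docId));
    }

    int count() {
        return store.size();
    }

    // Delete
    void delete(String docId) {
        store.remove(docId);
    }

    public static void main(String[] args) {
        InMemoryRepositorySketch<String> repo = new InMemoryRepositorySketch<>();
        String generatedId = repo.insert("first document");
        repo.save("myId", "second document");
        System.out.println(repo.exists(generatedId)); // true
        System.out.println(repo.count());             // 2
        repo.delete("myId");
        System.out.println(repo.find("myId"));        // Optional.empty
    }
}
```

A stand-in like this can also be handy in unit tests that should not depend on a running Stargate or Astra instance.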

✅ Initialization of repository

// Initialization (from namespaceClients)
NamespaceClient ns1Client = astraClient.apiStargateDocument().namespace("ns1");
StargateDocumentRepository<Person> personRepository1 = new StargateDocumentRepository<Person>(ns1Client, Person.class);

Points to note:

  • No collection name is provided here. By default the SDK uses the class name in lower case (here, person).
  • To override the collection name, annotate your bean Person with @Collection("my_collection_name").

// Initialization from CollectionClient, no ambiguity on collection name
CollectionClient colPersonClient = astraClient.apiStargateDocument().namespace("ns1").collection("person");
StargateDocumentRepository<Person> personRepository2 = new StargateDocumentRepository<Person>(colPersonClient, Person.class);
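
The naming rule described in the points above can be sketched as follows. This is an assumption about how such a lookup could work, not the SDK's code: the nested @Collection annotation is a local stand-in declared for this example, and CollectionNameSketch, Customer and collectionNameOf are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;

// Hypothetical sketch of the collection naming rule; not the SDK's implementation.
class CollectionNameSketch {

    // Local stand-in for the SDK's @Collection annotation
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Collection {
        String value();
    }

    // No annotation: the lower-cased class name is used
    static class Person {}

    // Annotation present: its value overrides the default name
    @Collection("my_collection_name")
    static class Customer {}

    static String collectionNameOf(Class<?> beanClass) {
        Collection annotation = beanClass.getAnnotation(Collection.class);
        return annotation != null
                ? annotation.value()
                : beanClass.getSimpleName().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(collectionNameOf(Person.class));   // person
        System.out.println(collectionNameOf(Customer.class)); // my_collection_name
    }
}
```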

✅ CRUD

We assume that the repository has been initialized as described above and is named personRepo.

if (!personRepo.exists("Cedrick")) {
  personRepo.save("Cedrick", new Person("Cedrick", "Lunven", new Address()));
}

// Print the first name of every person in the collection
personRepo.findAll()                       // Stream<ApiDocument<Person>>
          .map(ApiDocument::getDocument)   // Stream<Person>
          .map(Person::getFirstname)       // Stream<String>
          .forEach(System.out::println);