Document API

Cedrick Lunven edited this page Jan 6, 2022 · 33 revisions

Astra and Stargate allow Apache Cassandra to store documents like a document-oriented NoSQL database. To cope with the constraints of the Cassandra data model, documents are stored using a document shredding function.
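
As a rough illustration of what shredding means, the sketch below flattens a nested document into one row per leaf value, keyed by its path. It is a simplified model written for this page, not Stargate's actual implementation; the class name ShreddingSketch and the dotted path format are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: flattens a nested document into (path -> leaf value) rows. */
final class ShreddingSketch {

    private ShreddingSketch() {}

    /** Returns one entry per leaf value, keyed by its dotted path. */
    static Map<String, Object> shred(Map<String, Object> document) {
        Map<String, Object> rows = new LinkedHashMap<>();
        shred("", document, rows);
        return rows;
    }

    private static void shred(String prefix, Map<String, Object> node, Map<String, Object> rows) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Nested object: recurse and extend the path.
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                shred(path, child, rows);
            } else {
                // Leaf value: store it under its full path.
                rows.put(path, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zipCode", 75000);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("firstname", "John");
        person.put("address", address);
        shred(person).forEach((path, value) -> System.out.println(path + " = " + value));
    }
}
```

Each row holds a scalar, which is the kind of shape a wide-column table can store and index.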

As a Java developer, you want to work with objects (entities) and let the SDK interact with the API to perform the operations you need: create, read, update, delete, and search.

ApiDocumentClient Initialization

The initialization of AstraClient and StargateClient is detailed on the Home page. From here on, the sample code reuses those classes but does not initialize them.

The class ApiDocumentClient is the core class for working with documents. There are multiple ways to retrieve or initialize it.

// Option 1. Given an AstraClient
ApiDocumentClient client1 = astraClient.apiStargateDocument();
ApiDocumentClient client2 = astraClient.getStargateClient().apiDocument();

// Option 2. Given a StargateClient
ApiDocumentClient client3 = stargateClient.apiDocument();

// Option 3. Constructors
ApiDocumentClient client4_Astra    = new ApiDocumentClient("http://api_endpoint", "apiToken");
ApiDocumentClient client5_Stargate = new ApiDocumentClient("http://api_endpoint", 
  new TokenProviderDefault("username", "password", "http://auth_endpoint"));

From now on, the other samples use the variable name apiDocClient as the working instance of ApiDocumentClient.

Working with namespaces

This class is the main unit test for this API and can be used as reference code.

✅. Lists available Namespaces Names

Stream<String> namespaces = apiDocClient.namespaceNames();

✅. Lists available Namespaces

Related endpoint documentation can be found here

Stream<Namespace> namespaces = apiDocClient.namespaces();

✅. Find namespace by its id

Related endpoint documentation can be found here

Optional<Namespace> ns1 = apiDocClient.namespace("ns1").find();

✅. Test if namespace exists

apiDocClient.namespace("ns1").exist();

✅. Create a new namespace

🚨 As of today, namespace and keyspace creation in Astra is available only at the DevOps API level.

// Create a namespace with a single DC dc-1
DataCenter dc1 = new DataCenter("dc-1", 1);
apiDocClient.namespace("ns1").create(dc1);

// Create a namespace providing only the replication factor
apiDocClient.namespace("ns1").createSimple(3);

✅. Delete a namespace

🚨 As of today, namespace / keyspace deletion is not available in Astra

apiDocClient.namespace("ns1").delete();

ℹ️ Tips

You can simplify the code by assigning apiDocClient.namespace("ns1") to a NamespaceClient variable as shown below:

NamespaceClient ns1Client = astraClient.apiStargateDocument().namespace("ns1");
        
// Create if not exist
if (!ns1Client.exist()) {
  ns1Client.createSimple(3);
}
        
// Show datacenters where it lives
ns1Client.find().get().getDatacenters()
         .stream().map(DataCenter::getName)
         .forEach(System.out::println); 
        
// Delete 
ns1Client.delete();

Working with Collections

The related API documentation is available here

✅. Lists available Collections in a namespace

// We can create a local variable to shorten the code.
NamespaceClient ns1Client = apiDocClient.namespace("ns1");
Stream<String> colNames   = ns1Client.collectionNames();

✅. Check if collection exists

CollectionClient col1Client = apiDocClient.namespace("ns1").collection("col1");
boolean colExist = col1Client.exist();

✅. Retrieve a collection from its name

Optional<CollectionDefinition> col1Def = apiDocClient.namespace("ns1").collection("col1").find();

✅. Create an empty collection

apiDocClient.namespace("ns1").collection("col1").create();

✅. Delete a collection

apiDocClient.namespace("ns1").collection("col1").delete();

ℹ️ Tips

You can simplify the code by assigning apiDocClient.namespace("ns1").collection("col1") to a CollectionClient variable:

CollectionClient colClient = apiDocClient.namespace("ns1").collection("col1");
colClient.exist();
//...

In the following sections, we assume that a CollectionClient has been initialized. The samples refer to it as colClient, cp, or colPersonClient depending on the section.

Working with Documents

📘. About Document

In the Stargate Document API, documents are retrieved as a JSON payload together with a unique identifier (UUID).

{
  [...]
  "data": {
    "9e14db1c-0a05-47d2-9f27-df881f7f37ab": { "p1": "v1", "p2": "v2"},
    "9e14db1c-0a05-47d2-9f27-df881f7f37ac": { "p1": "v11", "p2": "v21"},
    "9e14db1c-0a05-47d2-9f27-df881f7f37ad": { "p1": "v12", "p2": "v22"}
  }
}

This SDK provides the class Document as a wrapper to give you both documentId (unique identifier) and document (payload).

public class Document<T> {
  private String documentId;
  private T document;
  // Constructor, Getters, Setters
}
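
To make the relationship between the JSON response and the wrapper concrete, the sketch below turns a "data" object (document id mapped to payload) into a list of wrappers. Doc<T> is a minimal stand-in written for this example, not the SDK's Document<T> class, and the wrap method is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: pairs each document id with its payload. */
final class DocumentWrapperSketch {

    private DocumentWrapperSketch() {}

    /** Minimal stand-in for the SDK's Document<T>: an id plus its payload. */
    static final class Doc<T> {
        private final String documentId;
        private final T document;

        Doc(String documentId, T document) {
            this.documentId = documentId;
            this.document = document;
        }

        String getDocumentId() { return documentId; }

        T getDocument() { return document; }
    }

    /** Turns the "data" object of a response (id -> payload) into a list of wrappers. */
    static <T> List<Doc<T>> wrap(Map<String, T> data) {
        List<Doc<T>> docs = new ArrayList<>();
        data.forEach((id, payload) -> docs.add(new Doc<>(id, payload)));
        return docs;
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("9e14db1c-0a05-47d2-9f27-df881f7f37ab", "{ \"p1\": \"v1\", \"p2\": \"v2\"}");
        data.put("9e14db1c-0a05-47d2-9f27-df881f7f37ac", "{ \"p1\": \"v11\", \"p2\": \"v21\"}");
        for (Doc<String> doc : wrap(data)) {
            System.out.println(doc.getDocumentId() + " -> " + doc.getDocument());
        }
    }
}
```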

📘. Object Mapping

Document payloads can be deserialized as beans or left unchanged as JSON. To build the expected beans you can either rely on Jackson or implement your own custom DocumentMapper. The sample code below illustrates both options.

// Retrieve data as JSON
Page<Document<String>> pageOfJsonRecords = cp.findPage();

// Retrieve data with default Jackson Mapper
Page<Document<Person>> pageOfPersonRecords1 = cp.findPage(Person.class);

// Retrieve data with a custom Mapper
Page<Document<Person>> pageOfPersonRecords2 = cp.findPage(new DocumentMapper<Person>() {
  public Person map(String record) {
     return new Person();
  }
});

✅. Search Documents in a collection (paged = recommended)

The Document API lets you search on any field of a document. A where clause is expected; in the REST API the parameter looks like: {"age": {"$gte":30},"lastname": {"$eq":"PersonAstra2"}}. In the SDK, dedicated queries and builders are provided: Query for full scans and PageableQuery for paged searches.
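
To show how a fluent builder can map onto the REST where clause, here is a minimal sketch that assembles the JSON parameter from (field, operator, value) triples. WhereClauseSketch is a hypothetical class written for this page; it makes no claim about how the SDK's Query and PageableQuery builders work internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: assembles a where clause such as {"age":{"$gte":30}}. */
final class WhereClauseSketch {

    // One condition per field; adding a second condition on the same field replaces the first.
    private final Map<String, String> conditions = new LinkedHashMap<>();

    WhereClauseSketch where(String field, String operator, Object value) {
        conditions.put(field, "{" + quote(operator) + ":" + literal(value) + "}");
        return this;
    }

    String toJson() {
        StringJoiner json = new StringJoiner(",", "{", "}");
        conditions.forEach((field, condition) -> json.add(quote(field) + ":" + condition));
        return json.toString();
    }

    /** Numbers and booleans are written as-is, everything else as a JSON string. */
    private static String literal(Object value) {
        return (value instanceof Number || value instanceof Boolean)
                ? String.valueOf(value)
                : quote(String.valueOf(value));
    }

    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String where = new WhereClauseSketch()
                .where("age", "$gte", 30)
                .where("lastname", "$eq", "PersonAstra2")
                .toJson();
        System.out.println(where);
    }
}
```

The output is the compact form of the REST parameter shown above, without the optional whitespace.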

  • Build your query with PageableQuery
PageableQuery query = PageableQuery.builder()
  .selectAll()
  .where("firstName").isEqualsTo("John")
  .and("lastName").isEqualsTo("Connor")
  .pageSize(3)
  //.pageState() used to get the next pages
  .build();
  • Retrieve your Page<T> with findPage(PageableQuery query); if you do not provide any marshaller, you get a JSON String.
Page<Document<String>> page1 = cp.findPage(query);

// Use pagingState in page1 to retrieve page2
if (page1.getPageState().isPresent()) {
  query.setPageState(page1.getPageState().get());
  Page<Document<String>> page2 = cp.findPage(query);
}
  • Retrieve your Page<T> with findPage(PageableQuery query, Class<T> class) using default Jackson Mapper
Page<Document<Person>> page1 = cp.findPage(query, Person.class);

// Use pagingState in page1 to retrieve page2
if (page1.getPageState().isPresent()) {
  query.setPageState(page1.getPageState().get());
  Page<Document<Person>> page2 = cp.findPage(query, Person.class);
}
  • Retrieve your Page<T> with findPage(PageableQuery query, DocumentMapper<T>) using your custom mapping
public static class PersonMapper implements DocumentMapper<Person> {
  @Override
  public Person map(String record) {
    Person p = new Person();
    // custom logic
    return p;
  }    
}

Page<Document<Person>> page1 = cp.findPage(query, new PersonMapper());
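
The page-state mechanism used in the samples above can be pictured without any network call. The sketch below uses a stand-in Page<T> and a fake in-memory backend whose page state is simply the next offset; all names (PagingSketch, inMemorySource, fetchAll) are hypothetical, and a real page state should be treated as an opaque token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Illustrative only: a paging loop driven by an opaque page state. */
final class PagingSketch {

    private PagingSketch() {}

    /** Stand-in for the SDK's Page<T>: results plus an optional state for the next page. */
    static final class Page<T> {
        final List<T> results;
        final Optional<String> pageState;

        Page(List<T> results, Optional<String> pageState) {
            this.results = results;
            this.pageState = pageState;
        }
    }

    /** Fake backend serving slices of a list; the page state is simply the next offset. */
    static <T> Function<Optional<String>, Page<T>> inMemorySource(List<T> items, int pageSize) {
        return state -> {
            int from = state.map(Integer::parseInt).orElse(0);
            int to = Math.min(from + pageSize, items.size());
            Optional<String> next = to < items.size()
                    ? Optional.of(String.valueOf(to))
                    : Optional.empty();
            return new Page<T>(new ArrayList<>(items.subList(from, to)), next);
        };
    }

    /** Follows the page state from page to page until no state is returned. */
    static <T> List<T> fetchAll(Function<Optional<String>, Page<T>> source) {
        List<T> all = new ArrayList<>();
        Optional<String> state = Optional.empty();
        do {
            Page<T> page = source.apply(state);
            all.addAll(page.results);
            state = page.pageState;
        } while (state.isPresent());
        return all;
    }

    public static void main(String[] args) {
        Function<Optional<String>, Page<String>> source =
                inMemorySource(Arrays.asList("a", "b", "c", "d", "e"), 2);
        System.out.println(fetchAll(source));
    }
}
```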

✅. Search Documents in a collection (not paged = can be slow)

🚨 These operations can be slow. Every query to the Document API is paged, so all findAll methods exhaust the pages one after the other under the hood.

  • Build your query with Query
Query query = Query.builder()
  .selectAll()
  .where("firstName").isEqualsTo("John")
  .and("lastName").isEqualsTo("Connor")
  .build();
  • Retrieve your Stream<T> with findAll(Query query); if you do not provide any marshaller, you get a JSON String.
Stream<Document<String>> result = cp.findAll(query);
  • Retrieving all documents of a collection is possible; it is the default query
// Get all documents
Stream<Document<String>> allDocs1 = cp.findAll();

// Equivalent to 
Stream<Document<String>> allDocs2 = cp.findAll(Query.builder().build());
  • Retrieve your Stream<T> with findAll(Query query, Class<T> class) using default Jackson Mapper
Stream<Document<Person>> res1 = cp.findAll(query, Person.class);
  • Retrieve your Stream<T> with findAll(Query query, DocumentMapper<T>) using your custom mapping
public static class PersonMapper implements DocumentMapper<Person> {
  @Override
  public Person map(String record) {
    Person p = new Person();
    // custom logic
    return p;
  }    
}

Stream<Document<Person>> res2 = cp.findAll(query, new PersonMapper());
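
Since every query is paged, a Stream over all documents has to pull pages one after the other. One possible way to do that lazily is sketched below: an Iterator requests the next page only when the current one is used up, and is then exposed as a Stream. This is an assumption about how such a method could be written, not the SDK's code; FindAllSketch, Page, and streamAll are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

/** Illustrative only: exposes a paged source as a single Stream. */
final class FindAllSketch {

    private FindAllSketch() {}

    /** Stand-in for one page of results: items plus the state of the next page, if any. */
    static final class Page<T> {
        final List<T> items;
        final Optional<String> nextState;

        Page(List<T> items, Optional<String> nextState) {
            this.items = items;
            this.nextState = nextState;
        }
    }

    /** Streams every item, requesting the next page only once the current one is consumed. */
    static <T> Stream<T> streamAll(Function<Optional<String>, Page<T>> fetchPage) {
        Iterator<T> iterator = new Iterator<T>() {
            private Iterator<T> current = Collections.emptyIterator();
            private Optional<String> nextState = Optional.empty();
            private boolean started = false;

            @Override
            public boolean hasNext() {
                // Keep fetching while the current page is used up and more pages remain.
                while (!current.hasNext() && (!started || nextState.isPresent())) {
                    Page<T> page = fetchPage.apply(nextState);
                    started = true;
                    current = page.items.iterator();
                    nextState = page.nextState;
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        // Two hard-coded pages: the first points to the second through its state.
        Function<Optional<String>, Page<String>> fetchPage = state -> state.isPresent()
                ? new Page<String>(Arrays.asList("c"), Optional.<String>empty())
                : new Page<String>(Arrays.asList("a", "b"), Optional.of("page-2"));
        streamAll(fetchPage).forEach(System.out::println);
    }
}
```

Each page still costs one round trip, which is why the paged variants are recommended for large collections.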

The class definitions for the beans used in the samples are available here.
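
For readers without access to those definitions, a plausible shape for the beans is sketched below. It is inferred only from the constructors, getters, and JSON fields that appear in the samples on this page, so treat the field names and types as assumptions rather than the SDK's actual test classes.

```java
/** Assumed shape of the Address bean, inferred from new Address("Paris", 75000). */
final class Address {
    private String city;
    private int zipCode;

    public Address() {}

    public Address(String city, int zipCode) {
        this.city = city;
        this.zipCode = zipCode;
    }

    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
    public int getZipCode() { return zipCode; }
    public void setZipCode(int zipCode) { this.zipCode = zipCode; }
}

/** Assumed shape of the Person bean, inferred from the constructors used in the samples. */
final class Person {
    private String firstname;
    private String lastname;
    private int age;
    private Address address;

    // No-argument constructor, as commonly needed by JSON mappers.
    public Person() {}

    public Person(String firstname, String lastname, Address address) {
        this(firstname, lastname, 0, address);
    }

    public Person(String firstname, String lastname, int age, Address address) {
        this.firstname = firstname;
        this.lastname = lastname;
        this.age = age;
        this.address = address;
    }

    public String getFirstname() { return firstname; }
    public void setFirstname(String firstname) { this.firstname = firstname; }
    public String getLastname() { return lastname; }
    public void setLastname(String lastname) { this.lastname = lastname; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
    public Address getAddress() { return address; }
    public void setAddress(Address address) { this.address = address; }
}
```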

✅. Get a document by id

// doc1 is the document Id in the collection
boolean docExist = colPersonClient.document("doc1").exist();

// Find returns an optional
Optional<Person> p = colPersonClient.document("doc1").find(Person.class);

✅. Create a new document with no id

// Define an object
Person john = new Person("John", "Doe", 20, new Address("Paris", 75000));

// As no id has been provided, the API will create a UUID and return it to you
String docId = colPersonClient.createNewDocument(john);

✅. Upsert a document enforcing the id

// Define an object
Person john2 = new Person("John", "Doe", 20, new Address("Paris", 75000));

// Now the id is provided (myId) and we can upsert
String docId = colPersonClient.document("myId").upsert(john2, Person.class);

✅. Delete a Document from its id

colPersonClient.document("myId").delete();

✅. Count Documents

🚨 This operation can be slow. Every query to the API is paged. The method fetches pages one after another (limiting the payload size as much as possible) for as long as there are any, and then counts the results.

int docNum = colPersonClient.count();

✅. Find part of a document

The Document API allows you to work with nested structures in a document. You provide the path in the URL:

http://{doc-api-endpoint}/namespaces/{namespace-id}/collections/{collection-id}/{document-id}/{document-path}

Given a JSON document with UUID e8c5021b-2c91-4015-aec6-14a16e449818:

{ 
  "age": 25,
  "firstname": "PersonAstra5",
  "lastname": "PersonAstra1",
  "address": {
    "city": "Paris",
    "zipCode": 75000
  }
}

You can retrieve the zipCode with: http://{doc-api-endpoint}/namespaces/ns1/collections/person/e8c5021b-2c91-4015-aec6-14a16e449818/address/zipCode

The SDK provides utility methods to work with sub-documents as well:

// Retrieve a sub-document and unmarshal it into an object
Optional<Address> address = colPersonClient
   .document("e8c5021b-2c91-4015-aec6-14a16e449818")
   .findSubDocument("address", Address.class);
        
// Retrieve a scalar deeper in the tree
Optional<Integer> zipcode = colPersonClient
  .document("e8c5021b-2c91-4015-aec6-14a16e449818")
  .findSubDocument("address/zipCode", Integer.class);
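
The path-based addressing described above can be pictured with plain Java collections. The sketch below builds the sub-document URL from the pattern shown earlier and resolves a slash-separated path inside a nested Map. SubDocumentPathSketch and its methods are hypothetical names used for illustration, not SDK classes, and the endpoint value is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: sub-document URLs and path resolution in a nested map. */
final class SubDocumentPathSketch {

    private SubDocumentPathSketch() {}

    /** Builds the URL of a sub-document following the pattern shown above. */
    static String url(String endpoint, String namespace, String collection,
                      String documentId, String path) {
        return String.format("%s/namespaces/%s/collections/%s/%s/%s",
                endpoint, namespace, collection, documentId, path);
    }

    /** Walks a nested map along a slash-separated path such as "address/zipCode". */
    static Optional<Object> resolve(Map<String, Object> document, String path) {
        Object current = document;
        for (String segment : path.split("/")) {
            if (!(current instanceof Map)) {
                return Optional.empty(); // tried to descend into a scalar
            }
            current = ((Map<?, ?>) current).get(segment);
            if (current == null) {
                return Optional.empty(); // no such attribute
            }
        }
        return Optional.of(current);
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zipCode", 75000);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("firstname", "PersonAstra5");
        person.put("address", address);

        System.out.println(resolve(person, "address/zipCode"));
        System.out.println(url("http://doc-api-endpoint", "ns1", "person",
                "e8c5021b-2c91-4015-aec6-14a16e449818", "address/zipCode"));
    }
}
```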

✅. Update a sub document

// Update an existing attribute of the JSON
colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .updateSubDocument("address", new Address("city2", 8000));

// Create a new attribute in the document
colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .updateSubDocument("secondAddress", new Address("city2", 8000));

✅. Delete part of a document

colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .deleteSubDocument("secondAddress");

Document Repository

📘. StargateDocumentRepository overview

If you have worked with Spring Data or Active Record before, you may already know what repositories are: classes that provide CRUD (create, read, update, delete) operations without you having to code anything.

It is no different here: if you provide an object type for a collection, this is what is available to you:

public interface StargateDocumentRepository <DOC> {
   
   // Create
   String insert(DOC p);
   void insert(String docId, DOC doc);
   
   // Read unitary
   boolean exists(String docId);
   Optional<DOC> find(String docId);

   // Read records
   int count();
   DocumentResultPage<DOC> findPage();
   DocumentResultPage<DOC> findPage(SearchDocumentQuery query);
   Stream<ApiDocument<DOC>> findAll();
   Stream<ApiDocument<DOC>> findAll(SearchDocumentQuery query);

   // Update
   void save(String docId, DOC doc);

   // Delete
   void delete(String docId);
}
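
To get a feel for these operations without a running database, the sketch below implements a subset of them on top of a map. It is a hypothetical in-memory stand-in (InMemoryRepositorySketch is not an SDK class); in particular, treating both insert with an id and save as plain overwrites is an assumption, and findAll returns map entries instead of ApiDocument wrappers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.stream.Stream;

/** Illustrative only: an in-memory stand-in exposing repository-style CRUD operations. */
final class InMemoryRepositorySketch<DOC> {

    private final Map<String, DOC> store = new LinkedHashMap<>();

    // Create: without an id, a random UUID is generated and returned.
    String insert(DOC doc) {
        String docId = UUID.randomUUID().toString();
        store.put(docId, doc);
        return docId;
    }

    void insert(String docId, DOC doc) {
        store.put(docId, doc);
    }

    // Read
    boolean exists(String docId) {
        return store.containsKey(docId);
    }

    Optional<DOC> find(String docId) {
        return Optional.ofNullable(store.get(docId));
    }

    int count() {
        return store.size();
    }

    Stream<Map.Entry<String, DOC>> findAll() {
        return store.entrySet().stream();
    }

    // Update (assumed to overwrite any existing document with the same id)
    void save(String docId, DOC doc) {
        store.put(docId, doc);
    }

    // Delete
    void delete(String docId) {
        store.remove(docId);
    }

    public static void main(String[] args) {
        InMemoryRepositorySketch<String> repo = new InMemoryRepositorySketch<>();
        String generatedId = repo.insert("first document");
        repo.save("myId", "second document");
        System.out.println(repo.exists(generatedId) + " " + repo.count());
        repo.findAll().forEach(e -> System.out.println(e.getKey() + " -> " + e.getValue()));
    }
}
```

A stand-in like this can be handy in unit tests of code that only needs repository-style access.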

✅. Initialization of repository

// Initialization (from namespaceClients)
NamespaceClient ns1Client = astraClient.apiStargateDocument().namespace("ns1");
StargateDocumentRepository<Person> personRepository1 = 
  new StargateDocumentRepository<Person>(ns1Client, Person.class);

Points to note:

  • No collection name is provided here. By default the SDK will use the class name in lower case (here person)
  • If you want to override the collection name you can annotate your bean Person with @Collection("my_collection_name")
// Initialization from CollectionClient, no ambiguity on collection name
CollectionClient colPersonClient = astraClient.apiStargateDocument()
 .namespace("ns1").collection("person");
StargateDocumentRepository<Person> personRepository2 = 
  new StargateDocumentRepository<Person>(colPersonClient, Person.class);
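
The naming rule described in the points above (lower-cased class name unless an annotation overrides it) can be sketched with plain reflection. The @Collection annotation declared here is a local stand-in for the SDK's annotation, and collectionNameOf is a hypothetical helper; the SDK's own lookup may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;

/** Illustrative only: derives a collection name from a bean class. */
final class CollectionNameSketch {

    private CollectionNameSketch() {}

    /** Local stand-in for the SDK's @Collection annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Collection {
        String value();
    }

    /** Bean without annotation: the lower-cased class name is used. */
    static class Person {}

    /** Bean with an explicit collection name. */
    @Collection("my_collection_name")
    static class AnnotatedPerson {}

    static String collectionNameOf(Class<?> beanClass) {
        Collection annotation = beanClass.getAnnotation(Collection.class);
        return annotation != null
                ? annotation.value()
                : beanClass.getSimpleName().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(collectionNameOf(Person.class));
        System.out.println(collectionNameOf(AnnotatedPerson.class));
    }
}
```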

✅. CRUD

We assume that the repository has been initialized as described above and is named personRepo.

if (!personRepo.exists("Cedrick")) {
  personRepo.save("Cedrick", new Person("Cedrick", "Lunven", new Address()));
}

// Print the first name of every person in the collection
personRepo.findAll()                     // Stream<ApiDocument<Person>>
          .map(ApiDocument::getDocument) // Stream<Person>
          .map(Person::getFirstname)     // Stream<String>
          .forEach(System.out::println);