Update documentation around TinkerPop HTTP API and serializers. #2908
base: master
generalized object serialization format. That characteristic makes it useful as a serialization format for Gremlin
Server where arbitrary objects of varying types may be returned as results. However, starting in GraphSON 4, GraphSON
is only intended to be a network serialization format that is only able to serialize specific types defined by the
format. It is only meant to be used between language variants and the Gremlin Server.
It is only meant to be used between language variants and the Gremlin Server.
is it just language variants? that seems to rule out pure http based requests, no?
Is it fair to say that GraphSON 4 is a useful "graph format" for any graph which is limited to containing TinkerPop serializable types?
@@ -54,6 +54,168 @@ This document attempts to address the needs of the different providers that have
* Graph Language Provider
* Graph Plugin Provider

== HTTP API
Something about the ordering of the Provider doc feels off to me. Someone implementing TinkerPop is the target reader. They likely would not start implementing by doing "HTTP API". The "Graph System Provider Requirements" are probably where most implementers would start their work. Could you give a bit of thought to organization and placement of this section?
I'd suggest a paragraph at the end of the "Provider Documentation" section above that says something like:
Deciding which of the interfaces, protocols, and tests to implement depends largely on where your work fits in the TinkerPop system and how advanced you intend for that implementation to be. The following bullet points detail common provider projects that implementers undertake and what parts of the documentation will be most relevant to them in getting started:
- Building a TinkerPop compliant graph database
- Building a Gremlin Server implementation to provide remote connectivity over HTTP and with Gremlin drivers
- Building a driver to work with Gremlin Server implementations
- Building a Gremlin execution engine
The bullets are just possible sections that could likely replace the bullets above that just list the types of providers. Each could have some sort of introductory paragraph that gives an overview of what's involved in those tasks, links to docs we have and other hints to get started. I think that would introduce the sections below much better.
|Key |Description |Value |Required
|gremlin |The Gremlin query to execute. |String containing script |Yes
|timeoutMs |The maximum time a query is allowed to execute in milliseconds. |Number between 0 and 2^31-1 |No
|bindings |Any bindings used to execute the query. |Object (Map) |No
could you clarify "bindings" a bit? or link to their meaning if described elsewhere?
|g |The name of the graph traversal source to which the query applies. Default: "g" |String containing traversal source name |No
|language |The name of the ScriptEngine to use to parse the gremlin query. Default: "gremlin-lang" |String containing ScriptEngine name |No
|materializeProperties |Whether to include all properties for results. One of "tokens" or "all". |String |No
|bulked |Whether the result should be "bulked" (only applies to GraphBinary) |Boolean |No
could you clarify "bulked" a bit? or link to their meaning if described elsewhere?
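On the `bindings` question, a rough illustration may help. Going by the `g.V(x).out()` example further down in this diff, bindings appear to be a map of named parameters whose values travel separately from the script text. The sketch below assembles a request body from the keys in the table above. The helper name `build_request_body` and its validation rules are assumptions made for illustration, not part of any TinkerPop driver.

```python
import json

# Upper bound for timeoutMs as stated in the table under review.
MAX_TIMEOUT_MS = 2**31 - 1


def build_request_body(gremlin, bindings=None, timeout_ms=None, g="g",
                       language="gremlin-lang", materialize_properties=None,
                       bulked=None):
    """Hypothetical helper: build a request body dict from the table's keys."""
    if not isinstance(gremlin, str) or not gremlin:
        raise ValueError("gremlin is required and must be a non-empty string")
    # "g" and "language" carry the defaults documented in the table.
    body = {"gremlin": gremlin, "g": g, "language": language}
    if bindings is not None:
        # Named parameters referenced by the script, e.g. x in g.V(x).
        body["bindings"] = dict(bindings)
    if timeout_ms is not None:
        if not 0 <= timeout_ms <= MAX_TIMEOUT_MS:
            raise ValueError("timeoutMs must be between 0 and 2^31-1")
        body["timeoutMs"] = timeout_ms
    if materialize_properties is not None:
        if materialize_properties not in ("tokens", "all"):
            raise ValueError('materializeProperties must be "tokens" or "all"')
        body["materializeProperties"] = materialize_properties
    if bulked is not None:
        body["bulked"] = bool(bulked)
    return body


if __name__ == "__main__":
    body = build_request_body("g.V(x).out()", bindings={"x": 1}, timeout_ms=30000)
    print(json.dumps(body, indent=2))
```

Optional keys are left out of the body unless the caller sets them, which keeps the sketch neutral about server-side defaults that the table does not state.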
=== HTTP Examples

For examples of actual requests and responses, take a look at the IO documentation for
link:https://tinkerpop.apache.org/docs/3.7.3/dev/io/#_requestmessage[GraphSON requests] and
should these links be hardcoded to 3.7.3? prefer /x.y.z?
[width="100%",cols="3,10,3,3",options="header"]
!=========================================================
!Name !Description !Required !Default
!code !The actual status code of the result. !Number !Yes
link to the error codes?
@@ -971,19 +1133,16 @@ extensible nature of Gremlin Server, it is difficult to provide an authoritative
It is however possible to describe the core communication protocol using the standard out-of-the-box configuration
which should provide enough information to develop a driver for a specific language.

image::gremlin-server-flow.png[width=300,float=right]

Gremlin Server is distributed with a configuration that utilizes link:http://en.wikipedia.org/wiki/WebSocket[WebSocket]
with a custom sub-protocol. Under this configuration, Gremlin Server accepts requests containing a Gremlin script,
evaluates that script and then streams back the results. The notion of "streaming" is depicted in the diagram to the
it appears you deleted the diagram, but the text still refers to it.
Let's use the incoming request to process the Gremlin script of `g.V()` as an example. Gremlin Server evaluates that
script, getting an `Iterator` of vertices as a result, and steps through each `Vertex` within it. The vertices are
batched together into an HTTP chunk. Each response is serialized given the requested serializer type (GraphBinary is
recommended) and written back to the requesting client immediately. Gremlin Server does not wait for the entire result
the docs keep mentioning "GraphBinary is recommended" but do we say anywhere why this is the case? ultimately, GraphBinary will effectively be the only option for drivers, so perhaps the docs should just be more clear that this is just a M1 recommendation?
|op |The name of the "operation" to execute based on the available `OpProcessor` configured in the Gremlin Server. To evaluate a script, use `eval`.
|processor |The name of the `OpProcessor` to utilize. The default `OpProcessor` for evaluating scripts is unnamed and therefore script evaluation purposes, this value can be an empty string.
|args |A `Map` of arbitrary parameters to pass to Gremlin Server. The requirements for the contents of this `Map` are dependent on the `op` selected.
|gremlin |The script to be executed by the server.
is this table a duplicate of what was in the HTTP section? any reason to have it in both places? shouldn't this just link to the HTTP section of the table? also, note this version of the table still references the "op" code.
|=========================================================

This message can be serialized in any fashion that is supported by Gremlin Server. New serialization methods can
be plugged in by implementing a `ServiceLoader` enabled `MessageSerializer`, however Gremlin Server provides for
This message can be serialized in any fashion that is supported by Gremlin Server. Gremlin Server provides for
JSON serialization by default which will be good enough for purposes of most developers building drivers.
Gremlin Server provides for JSON serialization by default which will be good enough for purposes of most developers building drivers.
again, not sure we should be recommending JSON serialization drivers going forward, particularly for building them (probably not even a good statement in the old docs - probably a holdover from the Gryo days).
"language":"gremlin-groovy"}} | ||
{ "gremlin":"g.V(x).out()",
"bindings":{"x":1},
"language":"gremlin-groovy"}}
----

The above JSON represents the "body" of the request to send to Gremlin Server. When sending this "body" over
more on the point of JSON - you called it a pseudo-JSON earlier in your writing. perhaps it is a "pseudo-request" around which you write this section of doc?
|requestId |The identifier of the `RequestMessage` that generated this `ResponseMessage`.
|status | The `status` contains a `Map` of three keys: `code` which refers to a `ResultCode` that is somewhat analogous to an link:http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html[HTTP status code], `attributes` that represent a `Map` of protocol-level information, and `message` which is just a human-readable `String` usually associated with errors.
|result | The `result` contains a `Map` of two keys: `data` which refers to the actual data returned from the server (the type of data is determined by the operation requested) and `meta` which is a `Map` of meta-data related to the response.
|status | The `status` contains a `Map` of three keys: `code` which refers to the link:http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html[HTTP status code], `exception` that is a `String` that describe the class of exception that the error falls ino, and `message` which is just a human-readable `String` usually associated with errors. For successul responses, only `status` is mandatory.
again - i'm not sure we should duplicate the table and should just refer to the HTTP section. you can add an anchor or something to link directly to the individual tables. moreover there are details there that the driver developer will want to know - stuff like the trailing headers details perhaps?
one key value pair present (since only one `Traversal` is being submitted, there is no sense to having more than a
single alias).
|=========================================================
All graph drivers are expected to support HTTP request intercepting. This means that the user of the graph driver
All graph drivers are expected to support HTTP request intercepting.
I think that statement is true of official TinkerPop drivers. that's an expectation we'd have. this documentation though is more of a guide for third-parties, so you are either:
1. someone building a driver
2. someone looking to build an interceptor
if you are (1) then i don't think we want to say that folks must do this to be a compliant driver. If they don't do it, then they should be aware that they possibly close off access to certain providers who might require the interceptor to connect. i think there is just a change of wording needed here.
if you are (2), i'm not sure you'd be here i guess. i suppose we have user docs in the drivers that cover this?
The IO test suite is a collection of files that contain the expected outcome of serialization of certain types. These
tests can be used to determine if a particular serializer has been correctly implemented. In general, a driver should
be able to "round trip" each of these types. That is, it should be able to both read from and write to those exact same
bytes. There may be some limitations based the types available in your driver's language, so it is not always possible
There may be some limitations based the types available in your driver's language
"based on the types"?
so it is not always possible to round trip every type.
could you supply an example please to make it clear when that happens and what folks would do in such a case?
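One possible example for this passage, offered as a sketch rather than a statement about any existing driver: a language with a single integer type cannot tell a 32-bit value from a 64-bit one after reading it. The code below uses the GraphSON type identifiers `g:Int32` and `g:Int64` with a deliberately naive reader and writer. The function names are hypothetical, and the comparison is done on parsed JSON rather than exact bytes to keep the sketch short.

```python
import json

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def read_typed(obj):
    """Naive reader: both integer widths collapse into Python's single int."""
    type_name, value = obj["@type"], obj["@value"]
    if type_name in ("g:Int32", "g:Int64"):
        return int(value)
    raise ValueError(f"unsupported type in this sketch: {type_name}")


def write_typed(value):
    """Naive writer: the original width is gone, so it is guessed from the value."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("only int is handled in this sketch")
    type_name = "g:Int32" if INT32_MIN <= value <= INT32_MAX else "g:Int64"
    return {"@type": type_name, "@value": value}


def round_trips(expected_json):
    """Read the expected form, write it back, and compare the two."""
    expected = json.loads(expected_json)
    return write_typed(read_typed(expected)) == expected


if __name__ == "__main__":
    print(round_trips('{"@type": "g:Int32", "@value": 1}'))  # True
    print(round_trips('{"@type": "g:Int64", "@value": 1}'))  # False: written back as g:Int32
```

In a case like the second one, a driver author might introduce a wrapper type that remembers the width, or limit the test for that type to the read direction. Either choice is a suggestion here, not something the documentation under review prescribes.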
}

The above example showed a `GET` operation, but the preferred method for this endpoint is `POST`:
`POST` is the only supported method for the endpoint:
according to the diff this reads as:
Once the server has started, issue a request. Here’s an example with cURL:
POST is the only supported method for the endpoint:
need to clean something up there.
and fail to always communicate. Discrepancy in serializer registration between client and server can happen fairly
easily as different graph systems may automatically include serializers on the server-side, thus leaving the client
to be configured manually. As an example:
there are two options for serialization: GraphSON and GraphBinary. Note, however, that starting in the full release of
maybe we should have a standard warning banner for stuff that is only expected in a particular milestone, that way we can write what's intended and it will be easy to search for banners to remove them later?
IMPORTANT: 4.0 Milestone Release - There is temporary support for GraphSON in the Java driver which will help with testing, but it is expected that the drivers will only support GraphBinary when GA is released.
something like that?
@@ -1024,6 +1022,26 @@ g2Client.submit("g.V()")
The above code demonstrates how the `alias` method can be used such that the script need only contain a reference
to "g" and "g1" and "g2" are automatically rebound into "g" on the server-side.

==== RequestInterceptor
might be good to have a simple example here direct in the docs or link to one in:
https://github.com/apache/tinkerpop/tree/master/gremlin-driver/src/main/java/examples
speaking of which, are those up to date?
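To make the request for a simple example concrete, here is a language-neutral sketch of the interceptor idea: a function that receives an outgoing HTTP request and returns a modified copy before it is sent. It is written in Python with a plain dict standing in for the request, so it is not the Java driver's `RequestInterceptor` API. The `/gremlin` path, the dict layout, and the function names are assumptions for illustration only.

```python
import base64


def basic_auth_interceptor(username, password):
    """Return an interceptor that adds an HTTP basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    def intercept(request):
        # Copy rather than mutate so earlier stages keep their own view.
        headers = dict(request.get("headers", {}))
        headers["Authorization"] = f"Basic {token}"
        return {**request, "headers": headers}

    return intercept


def apply_interceptors(request, interceptors):
    """Run each interceptor in order, feeding the output of one into the next."""
    for interceptor in interceptors:
        request = interceptor(request)
    return request


if __name__ == "__main__":
    # Hypothetical request shape; a real driver would use its own request type.
    outgoing = {
        "method": "POST",
        "path": "/gremlin",
        "headers": {"Content-Type": "application/json"},
        "body": '{"gremlin": "g.V()"}',
    }
    signed = apply_interceptors(outgoing, [basic_auth_interceptor("user", "pass")])
    print(signed["headers"])
```

Chaining interceptors as a list keeps authentication separate from other per-request changes, which is one way a provider-specific signing step could be added without touching the driver itself.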
request interceptor. Refer to your provider's documentation to determine if other authentication mechanisms are
available.

==== Transactions Disabled
hmmm, this is true of M1, but not what will be 4.0 GA. Are these upgrade docs for the M1 or for 4.0 GA? i kinda feel like it's the latter and that we need a different place to call out milestone limitations. how about adding:
== TinkerPop 4.0.0.M1
The 4.0.0.M1 is a milestone release. It is meant as a preview version to try out the new HTTP API features in the server and drivers. As this is a milestone version only, you can expect breaking changes to occur in future milestones for 4.0.0 on the way to its General Availability release. The following sections detail important limitations and constraints pertinent to this milestone that may or may not apply to General Availability.
as a new section and anything that is specific to M1 goes in there with
for more detailed information. The subprotocol remains fairly similar but has been adjusted to work better with HTTP.
Also, the move to HTTP means that SASL has been removed as an authentication mechanism and only HTTP basic remains.

===== Request Interceptor
as mentioned elsewhere "recommended" is fine, but it might help to be more explicit as to "why" by mentioning that not allowing for interceptor capabilities may preclude the driver from working with certain providers.
@@ -102,13 +104,11 @@ mime type is made explicit on requests to avoid breaking changes or unexpected r

Version 4.0 of GraphSON was first introduced on TinkerPop 4.0.0 and is represented by the
`application/vnd.gremlin-v4.0+json` mime type. There also exists an untyped version:
`application/vnd.gremlin-v3.0+json;types=false`. It is very similar to GraphSON 3.0, with just several key differences:
`application/vnd.gremlin-v4.0+json;types=false`. It is very similar to GraphSON 3.0, with just several key differences:
`application/vnd.gremlin-v4.0+json;types=false`. It is very similar to GraphSON 3.0, with just several key differences:
`application/vnd.gremlin-v4.0+json;types=false`. It is very similar to GraphSON 4.0, with just several key differences:
@@ -2988,8 +2980,6 @@ The following `ResponseMessage` is a typical example of the typical successful r
}
----

=== Extended

Note that the "extended" types require the addition of the separate `GraphSONXModuleV4d0` module as follows:
This note needs updating if we are discarding the Core and Extended types categories
Please look over this documentation change for either inaccurate information or missing information.
VOTE +1