[RFC] Add Appendix A: Persisted Documents #264

benjie · 2023-10-09T12:29:41Z

Addresses #38 (though not the "automatic" version, which can be built on top of this spec).

This is a first draft.

JoviDeCroock

I am super excited to see this come to fruition! Should we remove the RFC?

JoviDeCroock · 2023-10-09T12:33:15Z

spec/Appendix A -- Persisted Documents.md

+
+The {operationName} parameter, if present, must be a string.
+
+Each of the {variables} and {extensions} parameters, if used, MUST be encoded as


Do we have to define expectations in case this exceeds maximum URL size?

If we do, we should do it in the main spec: https://graphql.github.io/graphql-over-http/draft/#sec-GET

The Appendix tries not to redundantly repeat statements from the main spec if it can avoid it.

Which reminds me; I heard that some people are using headers to specify variables when using GraphQL-over-GET... Apparently that works around the length limit 🤨

Yes, a lot of folks use headers, and the server then lists those headers in the Vary response header so that browsers/... know they are part of the cache key.
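As a rough illustration of the header approach mentioned above, the sketch below merges extra request header names into a response's `Vary` value. The header name `X-GraphQL-Variables` used in the usage note is hypothetical (the thread does not name one), and `add_vary` is an illustrative helper, not part of any spec or library.

```python
def add_vary(headers: dict[str, str], *request_header_names: str) -> dict[str, str]:
    """Return a copy of `headers` whose Vary value also lists the given names."""
    # Parse the existing comma-separated Vary value, if any.
    existing = [v.strip() for v in headers.get("Vary", "").split(",") if v.strip()]
    seen = {v.lower() for v in existing}
    for name in request_header_names:
        # Header names are case-insensitive, so avoid listing one twice.
        if name.lower() not in seen:
            existing.append(name)
            seen.add(name.lower())
    result = dict(headers)
    result["Vary"] = ", ".join(existing)
    return result
```

For example, `add_vary({"Vary": "Accept"}, "X-GraphQL-Variables")` yields a `Vary` value of `Accept, X-GraphQL-Variables`, which tells caches that the (hypothetical) variables header is part of the cache key.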


benjie · 2023-10-09T12:49:19Z

I think the RFC document still warrants discussion; let's leave it until after the meeting.

benjie · 2023-10-09T12:55:21Z

Added to October 26th agenda: https://github.com/graphql/graphql-over-http/blob/main/working-group/agendas/2023/2023-10-26.md


Shane32 · 2023-10-10T00:01:50Z

spec/Appendix A -- Persisted Documents.md

+
+The server should retrieve the GraphQL Document identified by the {documentId}
+parameter. If the server fails to retrieve the document, it MUST respond with a
+well-formed _GraphQL response_ consisting of a single error. Otherwise, it


I suggest we define the contents of the error sufficiently for the client to conclusively recognize.

There's not much scope to do that currently; the best that we can do is ensure that the error message starts with, ends with, or contains, a particular string. Error codes would need to be specified in the main GraphQL specification for us to use them here, and writing to extensions from an official specification has all the issues previously raised. Since this is only needed for APQ currently, I'm happy leaving it unspecified (and addressing it when APQ is added), but I would support the addition of a non-normative example of the error message.

+1 to non-normative example. Attempt:

Note: typically, the error allows recognizing failures to retrieve a document:

```json
{ "errors": [{ "message": "unknown document identifier" }] }
```

Implementations may add a dedicated error code in the response extensions as described in [the error result format](https://spec.graphql.org/draft/#sec-Errors.Error-Result-Format)

Was coming to ask if there was a consistent error when the document identifier is not found
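To make the limitation discussed in this thread concrete, here is a minimal client-side sketch that recognizes a lookup failure purely by message text. The marker string is taken from the non-normative example attempt above and is an assumption; the appendix does not currently specify any message or error code, and `is_unknown_document_error` is a hypothetical helper.

```python
import json

# Assumed marker text, borrowed from the non-normative example in this thread.
# Nothing in the appendix currently guarantees this wording.
UNKNOWN_DOCUMENT_MARKER = "unknown document identifier"


def is_unknown_document_error(body: str) -> bool:
    """Best-effort check that a response body reports a document lookup failure."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    errors = payload.get("errors")
    # The draft text requires a response "consisting of a single error".
    if not isinstance(errors, list) or len(errors) != 1:
        return False
    error = errors[0]
    message = error.get("message") if isinstance(error, dict) else None
    return isinstance(message, str) and UNKNOWN_DOCUMENT_MARKER in message.lower()
```

String matching like this is fragile, which is the motivation for the suggestion that implementations may additionally put a dedicated code in the error's extensions.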

Shane32 · 2023-10-10T00:04:17Z

spec/Appendix A -- Persisted Documents.md

+:: A _persisted document request_ is an HTTP request that encodes the following
+parameters in one of the manners described in this specification:
+
+- {documentId} - (_Required_, string): The string identifier for the Document.


query is omitted here and throughout. Makes sense, as it is not necessary for operation and should not be included. But I think APQ-type behavior should be considered and allowed for within the spec. Preferably, it is an optional feature as part of this spec, or otherwise include a note to the effect that query may be allowed in certain use cases, etc.

In an unofficial APQ request based on this specification, the query should go into extensions, and the error code used to detect the missing query should also go into the error's extensions. We may specify query in future if we officially specify something like APQ; that's definitely feasible over what we already have.

What are your opinions on providing the documentId via the URL /graphql/<documentId>?
One of the benefits would be easier debugging and visibility in dev tooling.

Edit: already covered via https://github.com/graphql/graphql-over-http/pull/264/files#diff-9be5577e05ae2112d2b8f95584b162d0dec01453bf6c85df58bf5db4f2c9727aR166-R168

Yes I think this should be encouraged more; it's great for caching. I'd welcome your edits to address this, if you were so inclined.


mohsen1 · 2023-10-10T09:11:36Z

spec/Appendix A -- Persisted Documents.md

+sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b
+```
+
+### Custom Document Identifier


At Airbnb we append the operation name to the URL for the GraphQL requests for easier debugging and tracing. I wonder if this spec can include an example or implementation of such solutions:

I think it would be unwise to recommend that people do this based solely on the operation name (the risk of clashes as they iterate their queries is too high); however I would support the name being factored into the document identifier along with some hashing; e.g. something like: createHash('sha256').update(documentSource).digest('hex').substring(0, 12) + '_' + operationName. The operationName query parameter is already specified; perhaps we should add a non-normative note recommending that clients include it to aid in debugging?
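The Node.js expression above can be mirrored in Python roughly as follows. This is only a sketch of the suggestion in this comment: the function name is hypothetical, the 12-character truncation comes from the expression above, and UTF-8 encoding of the document source is an assumption (it matches Node's default for string input).

```python
import hashlib


def debug_friendly_identifier(document_source: str, operation_name: str) -> str:
    """Combine a truncated SHA-256 hex digest with the operation name."""
    digest = hashlib.sha256(document_source.encode("utf-8")).hexdigest()
    # The first 12 hex characters distinguish revisions of the same operation;
    # the operation name is appended purely for readability in logs and URLs.
    return f"{digest[:12]}_{operation_name}"
```

Two revisions of a query named `Me` would share the `_Me` suffix but differ in the hash portion, which avoids the clash risk of keying on the operation name alone.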

dotansimha · 2023-10-10T13:32:08Z

spec/Appendix A -- Persisted Documents.md

+identification methods by ensuring that the prefix starts `x-`; otherwise, all
+prefixes are reserved for reasons of future expansion.
+
+### SHA256 Hex Document Identifier


I wonder if actual encryption is needed? What's the benefit of having sha256 over any other opaque string? 🤔
We could let users decide how to encode/map their persisted documents, and then do the matching based on that value?

The aim of a specification like this is interoperability; so a client that supports persisted operations should be able to use a server that supports persisted operations without too much additional setup. Sharing details of the document identification method used out-of-band is supported (explicitly by x- prefix, or by custom identifiers); but for maximum compatibility there should be a shared baseline in my opinion.

I expect that the benefit is mainly for APQ support, where a client can enable APQ without knowledge of the server implementation, and so a consistent implementation is essential. SHA256 can be computed natively by modern browsers with a very low collision rate, making it a good choice in this scenario.

However, in a scenario where the identifiers are only known to the server, and must be registered with the server in order for the client to operate (as so far is documented in this RFC), they might as well be any opaque key returned by the query storage database. If the database stores the queries with an auto-incrementing integer as an identifier, that would work just as well.

Even so, there are still some benefits to using a hash:

Should the query storage mechanism change, the identifiers will remain constant (but the same would be true of a GUID)

Inherent deduplication for stored queries

Enhanced security, as attackers can't easily guess valid identifiers, reducing the risk of unauthorized query execution

I'm fine with the current suggestion of sha256: but allowing for x-id: and similar.

I see, thank you @Shane32 @benjie .
My point is not about the prefix or the method, but about the need for encryption. A user can decide to use operation-1 as the key (instead of an actual computed hash), and the result will be the same 🤔 🤔 🤔

@dotansimha Indeed, that's already allowed under this spec. The issue is that it requires coordination between server and client (they need to agree on how operation-1 is derived). For maximum compatibility, if both server and client already know how to identify operations (e.g. a standardized SHA256 hash), then all that's left is to transfer the docs from client to server. That can happen after the client has been built, rather than before or during, and no configuration is required on server or client, easing adoption for everyone and encouraging more people to use an operation allowlist.

Oh, right, but the queries still need to be transferred to the server is what you're saying. Yes, that's true, but it can be done after the client is built (but before it's deployed). That's different from having to do it during the build process; it means that clients can be built and persisted documents written even before the server exists. It also allows for arbitrary transfer of documents to the server (you can send them one-way on a pen drive through the mail if you want!).

So the client and the server must be coordinated, and we recommend to use SHA256, right?

"Recommend" and "should" are the same according to RFC2119, so I think that is saying what's already there. If the server follows this recommendation, the client doesn't need any configuration. If the server doesn't do this; then you need to coordinate between client and server. For optimal interoperability, no coordination should be necessary.

To be pedantic there, I'd argue that sharing the "sha256" method is still coordination between the client and server. Plus the documents need to be actually transferred (on a pen drive or avian carrier!).

So coordination is always required but sha256 is a convenient and widespread default which we recommend (hence the should)?

Indeed, it is a very light-touch asynchronous form of coordination. Essentially the coordination boils down to two things: 1. does the server support SHA256 hashes (don't necessarily need to ask the server this, it's a fact that should be established in the development team); 2. we need a way to send the operations+hashes to the server and to know when they have been persisted.

Client is informed that server supports SHA256 hashes.
Client: performs build including generation of hashes
Client (at some point later): ships hashes and operations to server somehow
Server (upon receipt): stores queries+hashes somehow
Client (after server has stored): is deployed

Since the client can't be deployed until the server has stored the queries/hashes, there is indeed coordination. The coordination is incredibly lightweight compared to alternatives where the server must generate hashes during the client build process.
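The client build step in the flow above could look something like this sketch. The `sha256:` prefix followed by a hex digest matches the identifier shown in the diff; hashing the UTF-8 encoding of the document, the manifest shape (identifier to document text), and the function names are assumptions made for illustration.

```python
import hashlib
import json


def sha256_document_identifier(document: str) -> str:
    """`sha256:` followed by the hex SHA-256 of the UTF-8 encoded document."""
    return "sha256:" + hashlib.sha256(document.encode("utf-8")).hexdigest()


def build_manifest(documents: list[str]) -> dict[str, str]:
    """Map each identifier to its document; identical documents collapse to one entry."""
    return {sha256_document_identifier(doc): doc for doc in documents}


if __name__ == "__main__":
    # Hypothetical documents collected from the client source at build time.
    manifest = build_manifest([
        "query Me { me { id } }",
        "query Me { me { id } }",
        "mutation Logout { logout }",
    ])
    # This JSON is what would later be shipped to the server "somehow".
    print(json.dumps(manifest, indent=2))
```

Because the identifiers are derived from the documents alone, this step needs no server involvement; only the later transfer of the manifest does.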

benjie · 2023-10-10T13:44:30Z

Thanks for your feedback everyone! I've adopted some of the feedback and replied to others. Keep it coming!

mcollina

lgtm, this is amazing

benjie · 2023-10-12T15:00:19Z

I've also written a piece on "trusted documents" - essentially the type of persisted documents that you can build an allowlist with (also known as "persisted queries" and various other names):

https://benjie.dev/graphql/trusted-documents

My hope is that "trusted documents" can become the preferred term when persisted documents are used as an allowlist (i.e. where your developers have written them), since it cannot be confused with other techniques such as "automatic persisted queries" (APQ), since these techniques involve no trust.

martinbonnin · 2023-10-12T16:49:48Z

Regarding documentId, would it make sense to add it to the main spec page as well? Somewhere around here, in the "Request parameters" section?

If documentId is a reserved key, we might as well allocate it in the main spec page? Going to an appendix for that feels a bit off.

benjie · 2023-10-13T10:31:43Z

@martinbonnin I see what you mean, but documentId is meaningless to a GraphQL-over-HTTP request; it's only relevant to a persisted document request. I guess we could add it as a "reserved key" ("a GraphQL-over-HTTP request must never contain a property documentId"); but at the moment all unspecified keys are effectively reserved (if you need to add a key, it should go in extensions) so that seems redundant.

martinbonnin · 2023-10-13T11:01:56Z

@benjie right. Maybe "Appendix B: Reserved keys" 🙃? I don't really know...
It'd be interesting to compare to what other protocols with optional features have done in the past. I don't have an example at hand right now but I'll keep looking.

In addition to the `doc_id` field documented in the relay docs and the apollo client extension format. The draft spec appendix (graphql/graphql-over-http#264) uses the `documentId` key. part of GB-6253 (the second part is docs)



JoviDeCroock · 2024-05-07T18:01:40Z

spec/Appendix A -- Persisted Documents.md

+identified document within a _persisted document request_ and know that it is
+trusted.
+
+Note: When used solely as a bandwidth optimization, an error-based mechanism


Since we hint at APQ here, should we take a position on run-time vs build-time generation of documents, explicitly stating that build-time generation gets you all the security benefits while run-time generation does not?

I'm happy with the wording as-is to cover this. Note that Persisted Documents are not necessarily trusted documents. All trusted documents are persisted documents, but not all persisted documents are trusted documents.

Should we mention "auto persisted queries" explicitly?

Note: When used solely as a bandwidth optimization, sometimes referred to as "auto persisted queries", ...

n1ru4l · 2024-05-21T15:29:13Z

spec/Appendix A -- Persisted Documents.md

+Note: When persisting a document it is generally good practice for the client to
+issue both the GraphQL Document and the document identifier to the server; the
+server would then regenerate the document identifier from the GraphQL Document
+independently, and check that the identifiers match before storing the Document.
+An alternative but equally valid approach has the client issue the GraphQL
+Document to the server, and the server returns an arbitrary _custom document
+identifier_ that the client would incorporate into its bundle.


We should write here that the operation identifier to persisted document mapping is usually a build artifact shared by the graphql client and graphql server.

The server can use an external "store" as the source for resolving a document identifier to an actual document

Would you like to propose edits in the form of a PR? Perhaps you're just suggesting an extension to this paragraph, such as:

identifier_ that the client would incorporate into its bundle. Either way, the operation identifier and persisted document mappings are usually a build artefact shared by the client and server.

Note: The server will typically retrieve persisted documents from a "store" at run-time, using the identifier as the lookup. The store could be files on the filesystem, a database, a durable in-memory key-value store, or anywhere else suitable for retrieving a value by a key.
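A minimal sketch of both notes, assuming SHA256 hex identifiers of the form `sha256:<hex>` computed over the UTF-8 encoded document: the server regenerates the identifier before storing, and later resolves identifiers through a store. `DocumentStore` is a hypothetical in-memory stand-in; a real store could equally be files, a database, or a key-value service.

```python
import hashlib
from typing import Optional


class DocumentStore:
    """In-memory stand-in for the "store" that maps identifiers to documents."""

    def __init__(self) -> None:
        self._documents: dict[str, str] = {}

    def persist(self, document: str, claimed_id: str) -> None:
        """Store the document only if the server-computed identifier matches."""
        expected = "sha256:" + hashlib.sha256(document.encode("utf-8")).hexdigest()
        if claimed_id != expected:
            raise ValueError("document identifier does not match document")
        self._documents[claimed_id] = document

    def retrieve(self, document_id: str) -> Optional[str]:
        """Look the document up by identifier; None means it was never persisted."""
        return self._documents.get(document_id)
```

Recomputing the identifier on the server means a client cannot register one document under another document's hash, and a `None` result from `retrieve` is the case where the server must respond with a single error.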


martinbonnin · 2024-05-26T06:17:40Z

spec/Appendix A -- Persisted Documents.md

+identified document within a _persisted document request_ and know that it is
+trusted.
+
+Note: When used solely as a bandwidth optimization, an error-based mechanism


Should we mention "auto persisted queries" explicitly?

Note: When used solely as a bandwidth optimization, sometimes referred to as "auto persisted queries", ...


martinbonnin · 2024-05-26T06:38:41Z

spec/Appendix A -- Persisted Documents.md

+
+The server should retrieve the GraphQL Document identified by the {documentId}
+parameter. If the server fails to retrieve the document, it MUST respond with a
+well-formed _GraphQL response_ consisting of a single error. Otherwise, it


+1 to non-normative example. Attempt:

Note: typically, the error allows recognizing failures to retrieve a document:

```json
{ "errors": [{ "message": "unknown document identifier" }] }
```

Implementations may add a dedicated error code in the response extensions as described in [the error result format](https://spec.graphql.org/draft/#sec-Errors.Error-Result-Format)


martinbonnin · 2024-05-26T06:55:47Z

spec/Appendix A -- Persisted Documents.md

+identification methods by ensuring that the prefix starts `x-`; otherwise, all
+prefixes are reserved for reasons of future expansion.
+
+### SHA256 Hex Document Identifier


To be pedantic there, I'd argue that sharing the "sha256" method is still coordination between the client and server. Plus the documents need to be actually transferred (on a pen drive or avian carrier!).

So coordination is always required but sha256 is a convenient and widespread default which we recommend (hence the should)?

martinbonnin · 2024-05-26T06:57:15Z

spec/Appendix A -- Persisted Documents.md

+A _document identifier_ must either be a _prefixed document identifier_ or a
+_custom document identifier_.


Should we give a formal BNF syntax? Maybe restrict the identifiers to alpha numeric? GraphQL names maybe?

Would you like to submit a PR to my PR to add this?

Sounds like a plan. This week's quite busy but I'll aim for next week! (famous last words 😅 )

That was the longest week ever but attempt at formal syntax is here
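While the formal syntax is being worked out, the distinction between the identifier kinds can be sketched as below. The rules here are assumptions pieced together from this thread, not the grammar proposed in the follow-up PR: a colon separates prefix from payload, an identifier without a colon is treated as a custom document identifier, `x-` prefixes are custom methods, every other prefix is reserved, and a `sha256` payload is 64 lowercase hex characters.

```python
import re

# Assumption: the payload of a SHA256 hex identifier is 64 lowercase hex digits.
SHA256_PAYLOAD = re.compile(r"[0-9a-f]{64}")


def classify_document_identifier(document_id: str) -> str:
    """Return 'custom', 'sha256', 'custom-method', 'reserved', or 'invalid'."""
    if ":" not in document_id:
        # No prefix at all: treated here as a custom document identifier.
        return "custom"
    prefix, payload = document_id.split(":", 1)
    if prefix == "sha256":
        return "sha256" if SHA256_PAYLOAD.fullmatch(payload) else "invalid"
    if prefix.startswith("x-"):
        # Custom identification method agreed out-of-band.
        return "custom-method"
    # All other prefixes are reserved for future expansion.
    return "reserved"
```

Under these assumptions `operation-1` classifies as `custom` and `x-id:42` as `custom-method`; whether further character restrictions apply is exactly what the formal syntax would settle.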

Co-authored-by: Martin Bonnin <[email protected]>

JoviDeCroock · 2024-07-26T12:54:54Z

spec/Appendix A -- Persisted Documents.md

+Note: A common alternative pattern is to use a dedicated URL for each _persisted
+operation_ (e.g.
+`https://example.com/graphql/sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b`).
+


Can we add a section on, e.g., returning 404 for persisted documents we can't find, and maybe even 400 if the identifier doesn't use an allowed prefix?

I think for the URL format 404 should be encouraged. I've been writing up a change to the appendix which encourages the URL format, but haven't had time to finish it yet; I've just raised a PR for my WIP so we have something to easily reference: #305

IMO for the non-URL version (traditional), 404 should not be used - it suggests that the /graphql endpoint is not found, which would be confusing.
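The distinction drawn in this thread can be sketched as a small helper. Only the 404-for-URL-form part comes from the discussion; the status for the traditional form is left to the caller via `fallback_status` because the thread does not settle it, and the error message is the non-normative example text from an earlier comment. `missing_document_response` is a hypothetical name.

```python
import json


def missing_document_response(identified_by_url_path: bool, fallback_status: int) -> tuple[int, str]:
    """Pick a status and a single-error GraphQL response body for an unknown document."""
    # With /graphql/<documentId>, 404 describes the missing resource accurately.
    # With a shared /graphql endpoint, 404 would wrongly suggest the endpoint is
    # missing, so the caller-supplied status is used instead.
    status = 404 if identified_by_url_path else fallback_status
    body = json.dumps({"errors": [{"message": "unknown document identifier"}]})
    return status, body
```

Either way the body stays a well-formed GraphQL response consisting of a single error, as the draft text requires.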

Shane32 · 2024-08-23T20:35:25Z

This RFC for Persisted Documents as currently written is now supported in GraphQL.NET 8.0.1 and GraphQL.NET Server 8.0.1. Please try it out!

Notes:

Supported for GET, JSON POST, url-encoded POST (when enabled), multipart POST (when enabled), and websocket requests (both subscriptions-transport-ws and graphql-ws)
No automatic persisted query support has been added for this spec
Apollo Automatic Persisted Queries are still supported using the old spec (within extensions)
Does not yet support url path parsing such as demonstrated in Persisted Documents: encourage URL approach #305
Does not restrict character sets for document identifiers (see Add identifier syntax #296 )

Non-breaking changes to this RFC will be implemented within GraphQL.NET as needed/requested on an ongoing basis. Breaking changes within GraphQL.NET occur only upon major version releases.

Commit: Compress URL further (25c6968)
benjie mentioned this pull request Oct 9, 2023, in "Persisted queries" / "stored operations" #38 (Open)
JoviDeCroock approved these changes Oct 9, 2023
Commit: Add Appendix A: Persisted Documents (7863941)
benjie force-pushed the persisted-documents branch from c08b7a1 to 7863941, October 9, 2023 12:36
benjie mentioned this pull request Oct 9, 2023, in Expand persisted operations discussion time #265 (Merged)
Commit: GET must not be used for mutations (a1a2c25)
JoviDeCroock mentioned this pull request Oct 9, 2023, in Remove mentions of APQ and Stripping ignored tokens from Persisted Operations RFC #266 (Merged)
Shane32 reviewed Oct 10, 2023
martinbonnin reviewed Oct 10, 2023
mohsen1 reviewed Oct 10, 2023
benjie added 2 commits October 10, 2023 14:17: Specify character encoding. (6e9cb85); Rewrite 'Persisting a document' section (433b37f)
dotansimha reviewed Oct 10, 2023
mcollina approved these changes Oct 10, 2023
dotansimha approved these changes Oct 12, 2023
benjie mentioned this pull request Oct 27, 2023, in Make it clear that extra keys in the request/response payloads are not allowed #271 (Closed)
kitten mentioned this pull request Mar 2, 2024, in feat: Add support for persisted documents via the documentId property urql-graphql/urql#3515 (Merged)
tomhoule mentioned this pull request Apr 4, 2024, in engine-(v1,v2): allow receiving trusted document ids under documentId grafbase/grafbase#1557 (Merged)
JoviDeCroock reviewed May 7, 2024 (two reviews)
JoviDeCroock mentioned this pull request May 7, 2024, in chore: mark Persisted-operations RFC as superseded by the appendix #291 (Merged)
n1ru4l reviewed May 21, 2024
martinbonnin reviewed May 26, 2024
benjie and others added 2 commits June 4, 2024 13:21: Apply suggestions from code review (6fbc6ed, co-authored by Martin Bonnin); Adopt some changes recommended via review (52d56fb)
Shane32 mentioned this pull request Jun 4, 2024, in [Feature] Add Persisted Document support graphql-dotnet/graphql-dotnet#3956 (Closed)
benjie mentioned this pull request Jun 19, 2024, in Response body for non-well-formed GraphQL-over-HTTP requests #293 (Closed)
martinbonnin mentioned this pull request Jul 21, 2024, in Add identifier syntax #296 (Open)
Shane32 mentioned this pull request Jul 22, 2024, in Add persisted document support graphql-dotnet/graphql-dotnet#3993 (Merged)
JoviDeCroock reviewed Jul 26, 2024
