Update authorization and pagination docs #1814

Open
wants to merge 20 commits into
base: source
Changes from 2 commits
104 changes: 72 additions & 32 deletions src/pages/learn/authorization.mdx
Original file line number Diff line number Diff line change
@@ -1,52 +1,92 @@
# Authorization

> Delegate authorization logic to the business logic layer
<p className="learn-subtitle">Delegate authorization logic to the business logic layer</p>

Most APIs will need to secure access to certain types of data depending on who requested it, and GraphQL is no different. GraphQL execution should begin after [authentication](/graphql-js/authentication-and-express-middleware/) middleware confirms the user's identity and passes that information to the GraphQL layer. But after that, you still need to determine if the authenticated user is allowed to view the data provided by the specific fields that were included in the request. On this page, we'll explore how a GraphQL schema can support authorization.

## Type and field authorization

Authorization is a type of business logic that describes whether a given user/session/context has permission to perform an action or see a piece of data. For example:

_"Only authors can see their drafts"_

Enforcing this kind of behavior should happen in the [business logic layer](/learn/thinking-in-graphs/#business-logic-layer). It is tempting to place authorization logic in the GraphQL layer like so:
Enforcing this behavior should happen in the [business logic layer](/learn/thinking-in-graphs/#business-logic-layer). Let's consider the following `Post` type defined in a schema:

```graphql
type Post {
  authorId: ID!
  body: String
}
```

In this example, we can imagine that when a request initially reaches the server, authentication middleware will first check the user's credentials and add information about their identity to the `context` object of the GraphQL request so that this data is available in every field resolver for the duration of its execution.

If a post's body should only be visible to the user who authored it, then we will need to check that the authenticated user's ID matches the post's `authorId` value. It may be tempting to place authorization logic in the resolver for the post's `body` field like so:

```js
const postType = new GraphQLObjectType({
name: 'Post',
fields: {
body: {
type: GraphQLString,
resolve(post, args, context, { rootValue }) {
// return the post body only if the user is the post's author
if (context.user && (context.user.id === post.authorId)) {
return post.body
}
return null
}
}
function Post_body(obj, args, context, info) {
// return the post body only if the user is the post's author
if (context.user && (context.user.id === obj.authorId)) {
return obj.body
}
})
return null
}
```

Notice that we define "author owns a post" by checking whether the post's `authorId` field equals the current user’s `id`. Can you spot the problem? We would need to duplicate this code for each entry point into the service. Then if the authorization logic is not kept perfectly in sync, users could see different data depending on which API they use. Yikes! We can avoid that by having a [single source of truth](/learn/thinking-in-graphs/#business-logic-layer) for authorization.
Notice that we define "author owns a post" by checking whether the post's `authorId` field equals the current user’s `id`. Can you spot the problem? We would need to duplicate this code for each entry point into the service. Then if the authorization logic is not kept perfectly in sync, users could see different data depending on which API they use. Yikes! We can avoid that by having a [single source of truth](/learn/thinking-in-graphs/#business-logic-layer) for authorization, instead of putting it in the GraphQL layer.

Defining authorization logic inside the resolver is fine when learning GraphQL or prototyping. However, for a production codebase, delegate authorization logic to the business logic layer. Here’s an example:
Defining authorization logic inside the resolver is fine when learning GraphQL or prototyping. However, for a production codebase, delegate authorization logic to the business logic layer. Here’s an example of how authorization of the `Post` type's fields could be implemented separately:

```js
// Authorization logic lives inside postRepository
const postRepository = require('postRepository');

const postType = new GraphQLObjectType({
name: 'Post',
fields: {
body: {
type: GraphQLString,
resolve(post, args, context, { rootValue }) {
return postRepository.getBody(context.user, post)
}
// authorization logic lives inside `postRepository`
export const postRepository = {
getBody({ user, post }) {
if (user?.id && (user.id === post.authorId)) {
return post.body
}
return null
}
})
}
```

The resolver function for the post's `body` field would then call a `postRepository` method instead of implementing the authorization logic directly:

```js
import { postRepository } from 'postRepository'

function Post_body(obj, args, context, info) {
  // return the post body only if the user is the post's author
  return postRepository.getBody({ user: context.user, post: obj })
}
```

In the example above, we see that the business logic layer requires the caller to provide a user object. If you are using GraphQL.js, the User object should be populated on the `context` argument or `rootValue` in the fourth argument of the resolver.
In the example above, we see that the business logic layer requires the caller to provide a user object, which is available in the `context` object for the GraphQL request. We recommend passing a fully-hydrated user object instead of an opaque token or API key to your business logic layer. This way, we can handle the distinct concerns of [authentication](/graphql-js/authentication-and-express-middleware/) and authorization in different stages of the request processing pipeline.
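As a rough sketch of that separation, the `context` for each request could be built by an authentication step that turns an opaque token into a full user object before any resolver runs. The `sessions` and `users` stores and the `authenticate` and `buildContext` helpers below are hypothetical stand-ins for illustration, not part of GraphQL.js or any server library:

```js
// Hypothetical in-memory stores standing in for a real session store and user database.
const sessions = new Map([['token-abc', 'user-1']])
const users = new Map([['user-1', { id: 'user-1', name: 'Leia', roles: ['AUTHOR'] }]])

// Authentication: resolve an opaque token to a fully-hydrated user object, or null.
function authenticate(token) {
  const userId = sessions.get(token)
  if (!userId) {
    return null
  }
  return users.get(userId) ?? null
}

// Build the per-request context that every field resolver will receive.
// Resolvers and the business logic layer only ever see `context.user`, never the token.
function buildContext(request) {
  const header = request.headers.authorization ?? ''
  const prefix = 'Bearer '
  const token = header.startsWith(prefix) ? header.slice(prefix.length) : null
  return { user: token ? authenticate(token) : null }
}
```

With this shape, the business logic layer can make authorization decisions from `user.id` or `user.roles` without knowing how the request was authenticated.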

## Using type system directives

In the example above, we saw how authorization logic can be delegated to the business logic layer through a function that is called in a field resolver.
Member

@benjie benjie Nov 22, 2024

Following on from https://github.com/graphql/graphql.github.io/pull/1814/files#r1852923240

Instead of trying to fit the auth directive approach into the "business logic layer" narrative, let's explicitly call it out as an alternative - something that people do but is not the recommended way:

Suggested change
In the example above, we saw how authorization logic can be delegated to the business logic layer through a function that is called in a field resolver.
In the example above, we saw how authorization logic can be delegated to the business logic layer through a function that is called in a field resolver. In general it is recommended to perform all authorization logic in the business logic layer, however if you decide to implement authorization in the GraphQL layer instead then one approach is to use [type system dire...

(merge with next paragraph)

The remainder of the text in this section would also need light editorial to reflect this.

What do you think?

Contributor Author

Sounds good, addressed in 587906a.


Another approach when implementing authorization checks for a GraphQL API is to use [type system directives](/learn/schema/#directives), where a directive such as `@auth` is defined in the schema with arguments that can indicate what roles or permissions a user must have to access the data provided by the types and fields where the directive is applied. For example:

```graphql
directive @auth(rule: Rule) on FIELD_DEFINITION

enum Rule {
  IS_AUTHOR
}

type Post {
  authorId: ID!
  body: String @auth(rule: IS_AUTHOR)
}
```

It would be up to the GraphQL implementation to determine how an `@auth` directive affects execution when a client makes a request that includes the `body` field for the `Post` type. However, the authorization logic should remain delegated to the business logic layer.
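One way to picture this, as a sketch only: the schema names a rule, while the rule's implementation stays in the business logic layer and the GraphQL layer merely looks it up. How a server reads `@auth` off a field definition differs between implementations, so the example below assumes the rule name has already been extracted. The `authorizationRules` registry and `withAuthRule` wrapper are hypothetical names, not an API of GraphQL.js:

```js
// Business logic layer (hypothetical): rule implementations keyed by the
// values of the schema's `Rule` enum.
const authorizationRules = {
  IS_AUTHOR: ({ user, post }) => Boolean(user?.id) && user.id === post.authorId,
}

// GraphQL layer (hypothetical): wrap a field resolver so that the named rule
// is consulted before the underlying resolver runs.
function withAuthRule(ruleName, resolve) {
  const rule = authorizationRules[ruleName]
  if (!rule) {
    throw new Error(`Unknown authorization rule: ${ruleName}`)
  }
  return function guardedResolver(obj, args, context, info) {
    // Resolve to null when the rule rejects the request.
    if (!rule({ user: context.user, post: obj })) {
      return null
    }
    return resolve(obj, args, context, info)
  }
}

// The resolver for `Post.body`, guarded by the rule named in the schema.
const Post_body = withAuthRule('IS_AUTHOR', (obj) => obj.body)
```

Note that the schema still decides which rule applies to which field, so this only keeps the rule's implementation, not the choice of rule, in the business logic layer.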
Member

I'm struggling to see how this leaves the logic in the business logic layer - isn't it the GraphQL layer here that's dictating that the rule to use is IS_AUTHOR rather than the business logic?

Contributor Author

I agree, this one is a bit ambiguous. I thought it was worth mentioning the type system directive approach because I've seen it used so often (whether or not it's a great idea to mix up auth rules in a schema). But I struggled to come up with a good example here that was adjacent to the prior example and also adheres to what would be considered the best practice. If you'd rather not invite readers to consider this option, I can just remove the section. Alternatively, I could remove the example and just provide a couple sentences explaining that people may see the type system directive approach in the wild. Let me know.


## Recap

To recap these recommendations for authorization in GraphQL:

We recommend passing a fully-hydrated User object instead of an opaque token or API key to your business logic layer. This way, we can handle the distinct concerns of [authentication](/graphql-js/authentication-and-express-middleware/) and authorization in different stages of the request processing pipeline.
- Authorization logic should be delegated to the business logic layer, not the GraphQL layer
- After execution begins, a GraphQL server should make decisions about whether the client that made the request is authorized to access data for the included fields
- Type system directives may be defined and added to the types and fields in a schema to apply generalized authorization rules
82 changes: 61 additions & 21 deletions src/pages/learn/pagination.mdx
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
# Pagination

> Different pagination models enable different client capabilities
<p className="learn-subtitle">Traverse lists of objects with a consistent field pagination model</p>

A common use case in GraphQL is traversing the relationship between sets of objects. There are a number of different ways that these relationships can be exposed in GraphQL, giving a varying set of capabilities to the client developer.
A common use case in GraphQL is traversing the relationship between sets of objects. There are different ways that these relationships can be exposed in GraphQL, giving a varying set of capabilities to the client developer. On this page, we'll explore how fields may be paginated using a cursor-based connection model.

## Plurals

The simplest way to expose a connection between objects is with a field that returns a plural type. For example, if we wanted to get a list of R2-D2's friends, we could just ask for all of them:
The simplest way to expose a connection between objects is with a field that returns a plural [List type](/learn/schema/#list). For example, if we wanted to get a list of R2-D2's friends, we could just ask for all of them:

```graphql
# { "graphiql": true }
{
query {
hero {
name
friends {
Expand All @@ -22,10 +22,10 @@ The simplest way to expose a connection between objects is with a field that ret

## Slicing

Quickly, though, we realize that there are additional behaviors a client might want. A client might want to be able to specify how many friends they want to fetch; maybe they only want the first two. So we'd want to expose something like:
Quickly, though, we realize that there are additional behaviors a client might want. A client might want to be able to specify how many friends they want to fetch; maybe they only want the first two. So we'd want to expose something like this:

```graphql
{
query {
hero {
name
friends(first: 2) {
Expand All @@ -37,20 +37,22 @@ Quickly, though, we realize that there are additional behaviors a client might w

But if we just fetched the first two, we might want to paginate through the list as well; once the client fetches the first two friends, they might want to send a second request to ask for the next two friends. How can we enable that behavior?

## Pagination and Edges
## Pagination and edges

There are a number of ways we could do pagination:
There are several ways we could do pagination:

- We could do something like `friends(first:2 offset:2)` to ask for the next two in the list.
- We could do something like `friends(first:2 after:$friendId)`, to ask for the next two after the last friend we fetched.
- We could do something like `friends(first:2 after:$friendCursor)`, where we get a cursor from the last item and use that to paginate.

In general, we've found that **cursor-based pagination** is the most powerful of those designed. Especially if the cursors are opaque, either offset or ID-based pagination can be implemented using cursor-based pagination (by making the cursor the offset or the ID), and using cursors gives additional flexibility if the pagination model changes in the future. As a reminder that the cursors are opaque and that their format should not be relied upon, we suggest base64 encoding them.
The approach described in the first bullet is classic _offset-based pagination_. However, this style of pagination can have performance and security downsides, especially for larger data sets. Additionally, if new records are added to the database after the user has made a request for a page of results, then offset calculations for subsequent pages may become ambiguous.
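The ambiguity can be illustrated with a small sketch. The `friends` array and the `pageByOffset` and `pageAfter` helpers are hypothetical and operate on an in-memory list rather than a real database:

```js
// Hypothetical list of friends, newest first.
let friends = ['Luke', 'Han', 'Leia', 'C-3PO']

// Offset-based: skip `offset` items, then take `first` items.
function pageByOffset(list, first, offset) {
  return list.slice(offset, offset + first)
}

// Position-based: take `first` items that come after a known item.
// If `after` is not found, this falls back to the start of the list.
function pageAfter(list, first, after) {
  const start = after == null ? 0 : list.indexOf(after) + 1
  return list.slice(start, start + first)
}

// The client fetches the first page.
const firstPage = pageByOffset(friends, 2, 0)

// A new record is added at the front before the next request arrives.
friends = ['Chewbacca', ...friends]

// The offset now points one item too early, so 'Han' is returned twice.
const secondPageByOffset = pageByOffset(friends, 2, 2)

// Paging relative to the last item seen is unaffected by the insertion.
const secondPageAfter = pageAfter(friends, 2, firstPage[firstPage.length - 1])
```

Here `secondPageByOffset` repeats an item from the first page, while `secondPageAfter` continues where the first page ended.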

That leads us to a problem; though; how do we get the cursor from the object? We wouldn't want cursor to live on the `User` type; it's a property of the connection, not of the object. So we might want to introduce a new layer of indirection; our `friends` field should give us a list of edges, and an edge has both a cursor and the underlying node:
In general, we've found that _cursor-based pagination_ is the most powerful of these designs. Especially if the cursors are opaque, either offset or ID-based pagination can be implemented using cursor-based pagination (by making the cursor the offset or the ID), and using cursors gives additional flexibility if the pagination model changes in the future. As a reminder that the cursors are opaque and their format should not be relied upon, we suggest base64 encoding them.
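A minimal sketch of opaque, base64-encoded cursors that wrap an offset might look like the following. It assumes a Node.js environment (for the global `Buffer`), and the `cursor<N>` payload format is an assumption chosen for illustration; clients should never depend on it:

```js
// Encode a numeric position as an opaque cursor string.
function encodeCursor(position) {
  return Buffer.from(`cursor${position}`, 'utf8').toString('base64')
}

// Decode a cursor back into the numeric position, rejecting anything malformed.
function decodeCursor(cursor) {
  const text = Buffer.from(cursor, 'base64').toString('utf8')
  const match = /^cursor(\d+)$/.exec(text)
  if (!match) {
    throw new Error(`Invalid cursor: ${cursor}`)
  }
  return Number(match[1])
}
```

Because only the server encodes and decodes cursors, the payload could later change from an offset to an ID or a timestamp without changing the schema or breaking clients.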

But that leads us to a problem—how do we get the cursor from the object? We wouldn't want the cursor to live on the `User` type; it's a property of the connection, not of the object. So we might want to introduce a new layer of indirection; our `friends` field should give us a list of edges, and an edge has both a cursor and the underlying node:

```graphql
{
query {
hero {
name
friends(first: 2) {
Expand All @@ -67,14 +69,14 @@ That leads us to a problem; though; how do we get the cursor from the object? We

The concept of an edge also proves useful if there is information that is specific to the edge, rather than to one of the objects. For example, if we wanted to expose "friendship time" in the API, having it live on the edge is a natural place to put it.

## End-of-list, counts, and Connections
## End-of-list, counts, and connections

Now we have the ability to paginate through the connection using cursors, but how do we know when we reach the end of the connection? We have to keep querying until we get an empty list back, but we'd really like for the connection to tell us when we've reached the end so we don't need that additional request. Similarly, what if we want to know additional information about the connection itself; for example, how many total friends does R2-D2 have?
Now we can paginate through the connection using cursors, but how do we know when we reach the end of the connection? We have to keep querying until we get an empty list back, but we'd like for the connection to tell us when we've reached the end so we don't need that additional request. Similarly, what if we want additional information about the connection itself, for example, how many friends does R2-D2 have in total?

To solve both of these problems, our `friends` field can return a connection object. The connection object will then have a field for the edges, as well as other information (like total count and information about whether a next page exists). So our final query might look more like:
To solve both of these problems, our `friends` field can return a connection object. The connection object will be an Object type that has a field for the edges, as well as other information (like total count and information about whether a next page exists). So our final query might look more like this:

```graphql
{
query {
hero {
name
friends(first: 2) {
Expand All @@ -96,20 +98,50 @@ To solve both of these problems, our `friends` field can return a connection obj

Note that we also might include `endCursor` and `startCursor` in this `PageInfo` object. This way, if we don't need any of the additional information that the edge contains, we don't need to query for the edges at all, since we got the cursors needed for pagination from `pageInfo`. This leads to a potential usability improvement for connections; instead of just exposing the `edges` list, we could also expose a dedicated list of just the nodes, to avoid a layer of indirection.
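To make the shape of a connection object concrete, here is a sketch of how a server might assemble one from an in-memory list, along with a client-style loop that relies on `hasNextPage` instead of waiting for an empty page. It assumes Node.js (for `Buffer`); `buildConnection`, `fetchAllFriends`, the `cursor<N>` cursor format, and the `friends` helper list are all assumptions made for illustration:

```js
// Opaque cursors wrapping a 1-based position (assumed format).
function toCursor(position) {
  return Buffer.from(`cursor${position}`, 'utf8').toString('base64')
}

function fromCursor(cursor) {
  const text = Buffer.from(cursor, 'base64').toString('utf8')
  return Number(text.replace(/^cursor/, ''))
}

// Build a connection object for one page of `allFriends`.
function buildConnection(allFriends, { first, after } = {}) {
  // The cursor holds the 1-based position of the last item already seen,
  // which is also the 0-based index of the next item to return.
  const start = after == null ? 0 : fromCursor(after)
  const end = first == null ? allFriends.length : Math.min(start + first, allFriends.length)
  const edges = allFriends.slice(start, end).map((node, i) => ({
    cursor: toCursor(start + i + 1),
    node,
  }))
  return {
    totalCount: allFriends.length,
    edges,
    // Helper list of just the nodes, avoiding the edge layer of indirection.
    friends: edges.map((edge) => edge.node),
    pageInfo: {
      startCursor: edges.length > 0 ? edges[0].cursor : null,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
      hasNextPage: end < allFriends.length,
    },
  }
}

// Client-style loop: keep requesting pages until `hasNextPage` is false.
function fetchAllFriends(allFriends, pageSize) {
  const names = []
  let after = null
  let page
  do {
    page = buildConnection(allFriends, { first: pageSize, after })
    names.push(...page.friends)
    after = page.pageInfo.endCursor
  } while (page.pageInfo.hasNextPage)
  return names
}
```

The loop never needs to inspect `edges`, since `pageInfo.endCursor` already provides the cursor for the next request.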

## Complete Connection Model
## Complete connection model

Clearly, this is more complex than our original design of just having a plural! But by adopting this design, we've unlocked a number of capabilities for the client:
Clearly, this is more complex than our original design of just having a plural! But by adopting this design, we've unlocked several capabilities for the client:

- The ability to paginate through the list.
- The ability to ask for information about the connection itself, like `totalCount` or `pageInfo`.
- The ability to ask for information about the edge itself, like `cursor` or `friendshipTime`.
- The ability to change how our backend does pagination, since the user just uses opaque cursors.

To see this in action, there's an additional field in the example schema, called `friendsConnection`, that exposes all of these concepts. You can check it out in the example query. Try removing the `after` parameter to `friendsConnection` to see how the pagination will be affected. Also, try replacing the `edges` field with the helper `friends` field on the connection, which lets you get directly to the list of friends without the additional edge layer of indirection, when that's appropriate for clients.
To see this in action, there's an additional field in the example schema, called `friendsConnection`, that exposes all of these concepts:

```graphql
interface Character {
  id: ID!
  name: String!
  friends: [Character]
  friendsConnection(first: Int, after: ID): FriendsConnection!
  appearsIn: [Episode]!
}

type FriendsConnection {
  totalCount: Int
  edges: [FriendsEdge]
  friends: [Character]
  pageInfo: PageInfo!
}

type FriendsEdge {
  cursor: ID!
  node: Character
}

type PageInfo {
  startCursor: ID
  endCursor: ID
  hasNextPage: Boolean!
}
```

You can try it out in the example query. Try removing the `after` argument for the `friendsConnection` field to see how the pagination will be affected. Also, try replacing the `edges` field with the helper `friends` field on the connection, which lets you get directly to the list of friends without the additional edge layer of indirection, when appropriate for clients:

```graphql
# { "graphiql": true }
{
query {
hero {
name
friendsConnection(first: 2, after: "Y3Vyc29yMQ==") {
Expand All @@ -129,6 +161,14 @@ To see this in action, there's an additional field in the example schema, called
}
```

## Connection Specification
## Connection specification

To ensure a consistent implementation of this pattern, the Relay project has a formal [specification](https://facebook.github.io/relay/graphql/connections.htm) you can follow for building GraphQL APIs that use a cursor-based connection pattern.
mandiwise marked this conversation as resolved.

## Recap

To recap these recommendations for paginating fields in a GraphQL schema:

To ensure a consistent implementation of this pattern, the Relay project has a formal [specification](https://facebook.github.io/relay/graphql/connections.htm) you can follow for building GraphQL APIs which use a cursor based connection pattern.
- List fields that may return a lot of data should be paginated
- Cursor-based pagination provides a stable pagination model for fields in a GraphQL schema
- The cursor connection specification from the Relay project provides a consistent pattern for paginating the fields in a GraphQL schema