(aka, stop torturing yourself with callbacks and chaining)
A common pattern for implementing asynchronous code flow is the use of callbacks or futures (also known as promises or deferreds). Historically, this code can be brittle or simply difficult to follow. In the Node.js world, where “no exceptions should go unhandled,” this is further complicated by handling and threading errors through callbacks.
Composers abstract the complexities of asynchronous programming.
Some benefits of this model are:
- Discards the need for managing futures, promises or callbacks in business logic code
- Error handling is easier to manage
- Blurs the line between same-process calls and remote service calls
- Provides timing, tracing and visual graph rendering for free (coming soon)
Some cons are:
- Namespaces for output keys are self-managed
- “Thinking in composers” has a bit of a learning curve from traditional callback patterns
- More verbose in some cases
Let's take a contrived, simple example of a dependency graph. Here, we are interested in the result of poking a user by id, which has multiple dependencies, some of which may be latent, remote calls.
poke            userService             pokeService
|
|------> getUser(userId)
|
|
|<------ user
|
|
|------------------------> pokeUser(user.name, user.email)
|
|
|<------------------------- result
A simple, callback-based example may look something like this:
// NOT composer code.
function getUserName(user) {
return user.name
}
function getUserEmail(user) {
return user.email
}
function getUser(id, callback) {
userService.getUser(id, function (err, user) {
return callback(err, user)
})
}
function pokeUser(id, callback) {
getUser(id, function (err, user) {
if (err) {
// Do something with error?
console.log(err)
return callback(err)
}
var name = getUserName(user)
var email = getUserEmail(user)
pokeService.poke(name, email, function (err, result) {
if (err) {
return callback(err)
}
callback(undefined, result.success)
})
})
}
pokeUser(userId, function(err, result) {
if (err) {
console.log(err)
return
}
if (result) {
// yay
}
})
The code above is clear to those familiar with Node's paradigms. However, it is error-prone, difficult to maintain, and provides little information out of the box during the course of its execution.
Another way to think of this is as a dependency graph. For example:
    user-id
       |
     user
    /    \
 name    email
    \    /
  poke-result
       |
   poke-user
In this example, the final output (poke-user) depends on the result of poking a user (poke-result). In order to poke a user, we need the user's name and email address. The user's name and email address depend on the user, and the user depends on a user id.
The variable here is the user id, which is supplied by a caller. This is how we think with composers. So, the same can be written as:
// Assuming services return promises, and do not use callbacks. Though Q makes
// it simple to create a promise out of a callback.
var composers = require('composers')
, Registry = composers.Registry
, Scope = composers.Scope
// Setup
var registry = new Registry()
registry.defineNode().given('user-id').outputs('user').with(function(userId) {
return userService.getUser(userId.get())
})
.build()
registry.defineNode().given('user').outputs('name').with(function(user) {
return user.get().getName()
})
.build()
registry.defineNode().given('user').outputs('email').with(function(user) {
return user.get().getEmail()
})
.build()
registry.defineNode().given('name', 'email').outputs('poke-result')
.with(function(name, email) {
return pokeService.poke(name.get(), email.get())
})
.build()
registry.defineNode().given('poke-result').outputs('poke-user')
.with(function(result) {
return result.get().success
})
.build()
// Runtime
var scope = new Scope(registry)
scope.enter()
scope.createGraph('poke-user').give('user-id', userId).start().then(function (result) {
if (result.get()) {
// yay
}
}, function (err) {
console.log(err)
})
A few things to note from the comparison example above:
- Handlers can return values, promises, or deferreds from the Q library.
- Callbacks are non-existent in logical nodes.
- Exceptions propagate as expected.
- Values are evaluated from the leaves up, that is, the call path cannot be followed sequentially like the callback method.
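The composer example above assumes the services already return promises. If a service only offers a Node-style callback API, it can be wrapped first. Below is a minimal sketch of such a wrapper that uses native promises rather than Q; legacyUserService and promisify are hypothetical names for illustration and are not part of composers.

```javascript
// Hypothetical callback-based service, for illustration only.
var legacyUserService = {
  getUser: function (id, callback) {
    setTimeout(function () {
      if (typeof id !== 'number') return callback(new Error('bad id'))
      callback(null, {id: id, name: 'Ada', email: 'ada@example.com'})
    }, 0)
  }
}

// Wraps a Node-style (err, value) callback function so that it returns a
// promise instead of taking a callback.
function promisify(fn, context) {
  return function () {
    var args = Array.prototype.slice.call(arguments)
    return new Promise(function (resolve, reject) {
      args.push(function (err, value) {
        if (err) reject(err)
        else resolve(value)
      })
      fn.apply(context, args)
    })
  }
}

var getUser = promisify(legacyUserService.getUser, legacyUserService)
```

A node handler could then return getUser(userId.get()) directly, and a rejected promise would surface like any other node error.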
Oh but why so verbose?? you ask... Well... you can always wrap and simplify it to your liking, I leave that up to you. :-)
node('user-id', 'user', function (userId) {
return userService.getUser(userId.get())
})
node('user', 'name', function (user) {
return user.get().getName()
})
node('user', 'email', function (user) {
return user.get().getEmail()
})
node(['name', 'email'], 'poke-result', function (name, email) {
return pokeService.poke(name.get(), email.get())
})
node('poke-result', 'poke-user', function (result) {
return result.get().success
})
compose('poke-user', {'user-id': userId}).then(function (result) {
if (result) {
// yay
}
})
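The node and compose helpers used above are not provided by composers. One possible way to write them on top of the builder API shown earlier is sketched below. makeHelpers is a hypothetical name, and the sketch assumes that give() returns the graph for chaining and that start() returns a promise for a wrapped value with a get() method.

```javascript
// Builds the shorthand helpers around an existing registry and Scope class.
function makeHelpers(registry, Scope) {
  // node(inputs, output, handler): inputs may be a single key or an array.
  function node(inputs, output, handler) {
    var builder = registry.defineNode()
    var keys = [].concat(inputs)
    builder = builder.given.apply(builder, keys)
    return builder.outputs(output).with(handler).build()
  }

  // compose(output, givens): evaluates one graph in a fresh scope and
  // resolves with the unwrapped value.
  function compose(output, givens) {
    var scope = new Scope(registry)
    scope.enter()
    var graph = scope.createGraph(output)
    Object.keys(givens || {}).forEach(function (key) {
      graph = graph.give(key, givens[key])
    })
    return graph.start().then(function (result) {
      return result.get()
    })
  }

  return {node: node, compose: compose}
}
```

Creating a fresh scope per compose call keeps the sketch short; an application would more likely reuse one scope per request so that cached outputs are shared.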
The basic concepts of composers are:
- Nodes: Logically composes a set of inputs into a single output value.
- Graphs: Compute a single output value by dynamically constructing a dependency graph of nodes.
- Scopes: Execution contexts for graphs. Scopes cache graph node outputs.
All inputs and outputs are defined by a unique key. A node can require any number of inputs, and it is guaranteed that all inputs will be available by the time the node function is executed. A node function can return a value or a promise for a value.
It's very important to note that developer code will not asynchronously execute OR block in the world of composers. All values are readily available when the composer executes. This greatly simplifies coding.
It can be difficult to think in terms of composers, due to their indirect nature. Notably, all nodes run from the "leaves on up," which can confuse some developers. This is outweighed by the benefits that the model provides.
To aid, here is an example trace output for a given graph trace:
And here is the test:
node().outputs('first-name').with(function () {
return delayed(20, function () {
return 'David'
})
}).build()
node().outputs('last-name').with(function () {
return delayed(200, function () {
return 'Byttow'
})
}).build()
node().given('first-name', 'last-name').outputs('full-name')
.with(function (firstName, lastName) {
return firstName.get() + ' ' + lastName.get()
}).build()
node().given('full-name').outputs('lowercased').with(function (fullName) {
return fullName.get().toLowerCase()
}).build()
node().given('lowercased').outputs('underscored').with(function (name) {
return delayed(100, function () {
return name.get().replace(' ', '_')
})
}).notCacheable().build()
node().given('lowercased').outputs('uppercased').with(function (name) {
return delayed(200, function () {
return name.get().toUpperCase()
})
}).notCacheable().build()
node().given('lowercased', 'uppercased', 'underscored').outputs('names').with(
function (lowercased, uppercased, underscored) {
return {
lowercased: lowercased.get()
, uppercased: uppercased.get()
, underscored: underscored.get()
}
}).build()
scope.createGraph('names').then(function (names) {
test.equals(names.lowercased, 'david byttow')
test.equals(names.uppercased, 'DAVID BYTTOW')
test.equals(names.underscored, 'david_byttow')
test.done()
}).end()
npm install composers
Or grab the source and
npm install
npm install nodeunit -g
nodeunit tests/composers_test.js
Composers requires the Q framework for promises.
Composers work by dynamically constructing and evaluating a dependency tree from a global registry of nodes. Each node defines a single output key and an optional list of input keys. From this simple definition, we can build a dependency sub-tree from any given node by key.
Each node is invoked when all inputs are ready, therefore input values never block and are always present. This greatly simplifies handler method implementations.
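To make the evaluation order concrete, here is a toy resolver. It is not the composers implementation: it assumes native promises instead of Q, passes plain values rather than wrapped inputs, and the names nodes and evaluate are hypothetical.

```javascript
// Each entry maps an output key to its input keys and its handler.
var nodes = {
  'first-name': {inputs: [], handler: function () { return 'David' }},
  'last-name': {
    inputs: [],
    handler: function () { return Promise.resolve('Byttow') }
  },
  'full-name': {
    inputs: ['first-name', 'last-name'],
    handler: function (first, last) { return first + ' ' + last }
  }
}

// Resolves `key` by resolving its inputs first (leaves up), then invoking
// the handler only once every input value is available.
function evaluate(key, order) {
  var node = nodes[key]
  if (!node) return Promise.reject(new Error('Key ' + key + ' not bound'))
  var inputs = node.inputs.map(function (input) {
    return evaluate(input, order)
  })
  return Promise.all(inputs).then(function (values) {
    order.push(key) // record when this node's handler actually runs
    return node.handler.apply(null, values)
  })
}

var order = []
var fullName = evaluate('full-name', order)
```

Because the handler for full-name runs only after Promise.all has settled, it never has to wait on or check its inputs, which is the property described above.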
Node handler methods must do one of three things:
- Return a value
- Return a future value (using the promises in Q library)
- Throw an error
Returning a value is straightforward, and the value will be cached in the given scope unless otherwise specified. This is common for values that are readily available, such as query parameters for the current request.
Returning a future value is handled by the composer framework, which resolves the input only once the future has been delivered. This is the most common result type for RPC-based handlers that are waiting on a remote output. See more in the Futures section of this document.
A thrown error will be caught by the framework and sent down as an input, only to be thrown and handled when it is requested. More on that in the Error Handling section below.
When a graph is requested and invoked, a dependency tree of inputs is built. Nodes that are predefined and declare an output are also considered implicit inputs for any given graph. However, it is common that some subgraphs will require an explicit input. For example, the id of the current user, which is known when the request is dispatched.
Whether implicit or explicit, for a graph to be valid, all inputs in the transitive closure must be available. Explicit inputs are defined via give on the Graph object. For example:
reg.defineNode().outputs('A').with(function () {
return 'A'
}).build()
reg.defineNode().given('A', 'B').outputs('A+B').with(function (a, b) {
return a.get() + b.get()
}).build()
var promise = scope.createGraph('A+B')
.give('B', 'FOO')
.start()
promise.then(function (ab) {
console.log(ab.get()) // outputs 'AFOO'
})
If B was never bound, then a "Key B not bound" error would be thrown immediately.
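The validity rule can be illustrated with a small check over the transitive closure of inputs. This is a sketch of the idea only, under the assumption that a key counts as bound if some node outputs it or the caller gives it explicitly; dependencies and findUnboundKeys are hypothetical names, not composers APIs.

```javascript
// Node table: output key -> list of input keys (handlers omitted).
var dependencies = {
  'A': [],
  'A+B': ['A', 'B']
}

// Walks the transitive closure of `target` and collects every key that is
// neither produced by a node nor explicitly given.
function findUnboundKeys(target, given) {
  var unbound = []
  var seen = {}
  function visit(key) {
    if (seen[key]) return
    seen[key] = true
    if (given.indexOf(key) !== -1) return
    if (!dependencies.hasOwnProperty(key)) {
      unbound.push(key)
      return
    }
    dependencies[key].forEach(visit)
  }
  visit(target)
  return unbound
}

var missing = findUnboundKeys('A+B', [])
var satisfied = findUnboundKeys('A+B', ['B'])
```

Here missing contains only B, and satisfied is empty because B was supplied by the caller.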
Scopes are essentially the execution context of any graph evaluation. The most common scope is the request scope, where for any new request, all nodes are re-computed and cached whenever possible for the lifetime of that request.
Scopes provide the API for creating graphs, and thus, evaluating future values.
Values are retrieved from graphs: each graph instance produces a single value, and a graph can only be executed once per instance.
Graphs, when started, will return a future for the output result. Unlike composer methods, graph future callbacks use the standard Node mechanism of returning an (error, value) signature.
Graphs always evaluate from the leaf nodes (dependencies) on up.
Promises are objects that will eventually contain a value in the future. The most common operation performed on a future is to attach a callback that is later executed when the value is ready.
Futures are nice, but can become clumsy or error-prone when the client must transform future values, handle future result errors, or chain/join multiple futures together to produce a single value. Composers mask all of this by managing futures for the developer.
Typically, services will return a future result as the request is asynchronous, so returning the future to the composer framework is extremely useful. Word of warning: Although it's perfectly reasonable to transform a future and return the future in a single node, you lose the extra benefits of breaking it up into multiple nodes.
For example:
reg.defineNode().outputs('files').with(function () {
var promise = fileService.getFiles()
return promise.then(function (files) {
return transformFiles(files)
})
}).build()
That works, but the inner transformation is completely hidden from the composer system, which means it's never logged or measured in any meaningful way. The way to do this with composers is:
reg.defineNode().outputs('raw-files').with(function () {
return fileService.getFiles()
}).build()
reg.defineNode().given('raw-files').outputs('files').with(function (files) {
return transformFiles(files.get())
}).build()
Now, clients can request either raw-files or files, the code is simpler and everything is measured, logged and graphed.
Composers make exception handling easy by abstracting the need to handle exceptions except at the point the data is requested. That is why each input value is wrapped and accessed with a “get” method. Here is a simple example:
reg.defineNode().outputs('A').with(function () {
throw new Error('oops!')
}).build()
reg.defineNode().given('A').outputs('B').with(function (a) {
return a.get() + 1 // throws Error('oops!')
}).build()
...
scope.createGraph('B').start().then(function (b) {
// never called
}, function(err) {
// err is equal to Error('oops!')
})
It's important to note that when an exception is thrown, it will be logged immediately after the node is evaluated, but the error will not propagate unless its value is used, for example:
scope.createGraph('B').start().then(function (b) {
// never called, exception suppressed
})
Caching occurs at the scope level for any given node, unless otherwise specified. For example:
reg.defineNode().outputs('cached').with(function () {
return Math.random()
}).build()
reg.defineNode().outputs('not-cached').with(function () {
return Math.random()
}).notCacheable().build()
reg.defineNode().given('cached').outputs('A').with(function (cached) {
return cached.get() // always returns the same result, A is also cached
}).build()
reg.defineNode().given('not-cached').outputs('B').with(function (notCached) {
return notCached.get() // returns unique values, B is also NOT cached
}).build()
This makes for good performance by default when requesting nodes multiple times that may result in latent calls.
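The caching behavior can be pictured with a toy scope that stores one promise per output key. This is a sketch under assumptions, not how composers is implemented: ToyScope, its evaluate method and the node objects are hypothetical, and native promises stand in for Q.

```javascript
// Toy scope-level cache; not the composers implementation.
function ToyScope() {
  this.cache = {}
}

// Evaluates a node's handler, reusing the cached promise for cacheable nodes.
ToyScope.prototype.evaluate = function (node) {
  if (node.cacheable && this.cache.hasOwnProperty(node.output)) {
    return this.cache[node.output]
  }
  var result = Promise.resolve().then(node.handler)
  if (node.cacheable) this.cache[node.output] = result
  return result
}

// Handlers count their invocations so the effect of caching is visible.
var calls = {cached: 0, notCached: 0}
var cachedNode = {
  output: 'cached',
  cacheable: true,
  handler: function () { return ++calls.cached }
}
var notCachedNode = {
  output: 'not-cached',
  cacheable: false,
  handler: function () { return ++calls.notCached }
}

var toyScope = new ToyScope()
var results = Promise.all([
  toyScope.evaluate(cachedNode),
  toyScope.evaluate(cachedNode),
  toyScope.evaluate(notCachedNode),
  toyScope.evaluate(notCachedNode)
])
```

Caching the promise rather than the resolved value means that two requests for the same key made while a latent call is still in flight share that single call.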
The evaluation of all nodes is timed and logged accordingly. This lets us alert on and inspect hotspots in our request flow at as granular a level as we'd like.
Additionally, any given request can output a visual graph of all nodes evaluated for the graph along with all timing values and hotspots using gnuplot (or something else).
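As a rough illustration of per-node timing, a handler can be wrapped so that each invocation reports how long it took. This is only a sketch of the idea: timed and the shape of the reported entry are hypothetical, and the framework's actual tracing hooks may look quite different.

```javascript
// Wraps a handler so each invocation reports its duration to `report`.
function timed(key, handler, report) {
  return function () {
    var start = Date.now()
    var args = arguments
    function finish(status) {
      report({key: key, status: status, ms: Date.now() - start})
    }
    // Promise.resolve().then(...) also turns a synchronous throw from the
    // handler into a rejection, so both outcomes are timed the same way.
    return Promise.resolve()
      .then(function () { return handler.apply(null, args) })
      .then(function (value) {
        finish('ok')
        return value
      }, function (err) {
        finish('error')
        throw err
      })
  }
}

var timings = []
var slowNode = timed('last-name', function () {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve('Byttow') }, 20)
  })
}, function (entry) { timings.push(entry) })
```

Calling slowNode() resolves with the handler's value and appends one entry to timings, which a logger or graph renderer could then consume.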
Another great aspect of this abstraction pattern is callers are agnostic to how the output is derived. Notably, there becomes little need for explicit “services” and theoretically subgraphs can be broken off into separate servers. This gives us a great amount of flexibility when scaling our services.
Below is a list of the "exposed" interfaces.
Defines composer nodes to be used in the graph. Each node has the following properties:
- A single output key
- Zero or more input keys
- A handler function
- Options (cacheable, etc)
NodeBuilders can be created in one of two ways:
var builder = registry.defineNode()
Or
var builder = new NodeBuilder(registry)
It exposes the following API, which is chainable:
- given(key, ...): Declares a set of input keys as dependencies for this node. (Optional)
- outputs(key): Sets the unique key that this node outputs. (Required)
- with(handler): Sets the handler that will be invoked when the node's inputs are ready and output should be produced. (Required)
- notCacheable(): Declares this node as NOT cacheable. (Optional)
- build(): Finalizes and adds the node into the scope's registry. (Required)
The registry is the container that keys all nodes by their output key. Generally, there is a single global registry that all nodes are registered into.
Registries are meant to be constructed once, used to define nodes, and passed into scopes.
var registry = new Registry()
Used to define a node, see the Node DSL above.
A scope is the realm in which graphs are evaluated and node output values are cached. Scopes must be entered before nodes can be evaluated.
To create a scope, you must supply the registry that it will register nodes in and an optional parent scope.
var scope = new Scope(registry)
Creates a new graph for evaluation for a single output key. The scope must be entered when calling this. Any evaluated outputs (either explicit or implicit) will be cached in this scope while it is open.
Enters the current scope with an empty cache for output keys. An optional context may be passed to the scope, which is the context that all nodes will be evaluated in within this scope.
Exits the current scope and clears the cache of evaluated nodes.
Seeds the given value for the specified key in this scope. This is useful for inputs that are not bound to specific nodes and are scope-specific. For example, a request-based scope (a new scope for every inbound request) might seed the request and response objects to 'req' and 'resp' keys for nodes to input.
A graph represents the dependency tree in the global set of nodes used to obtain a single output value. To construct a graph, you must use an (active) scope.
// Create a graph for the output key named 'foobar'
var graph = scope.createGraph('foobar')
Once a graph is created, you may use give to bind specific input keys that the graph requires in order to evaluate. For example, evaluating 'search-results' will probably require an input key named something like 'search-query', which is specific to that individual request.
Starts the graph for evaluation, can only be started once. Returns a promised result.
Adds a particular node into the graph for evaluation. Typically this is used when a given output key relies on an input that is not implicitly provided by another node. For example:
scope.createGraph('search-results').give('search-query', query).start()
.then(function (results) {
// Do something with the results.
})
Alternatively, sometimes it's useful to give the graph a future result. To do this, you can bind a future input with a promise, or the output of another graph (for chaining).
// Note, this is a contrived example. Typically user-data would simply depend on the user key.
var futureUser = scope.createGraph('user').give('user-id', userId).start()
scope.createGraph('user-data').give('user', futureUser).start().then(function (data) {
// Do something with user data.
})
As alluded to above, actual values are not passed into node handlers. Instead, each handler receives an instance of a special Input class with a single .get() method. This method, when called, will either return the value or throw the exception that the originating node raised.
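A wrapper with this behavior can be sketched in a few lines. The code below is an illustration rather than the library's Input class: ToyInput and capture are hypothetical names, and how composers stores the value or error internally is not specified here.

```javascript
// Holds either a value or the error raised by the node that produced it.
function ToyInput(value, error) {
  this.value = value
  this.error = error
}

// Returns the value, or rethrows the originating node's error.
ToyInput.prototype.get = function () {
  if (this.error) throw this.error
  return this.value
}

// Runs a handler and captures its outcome as a ToyInput instead of throwing.
function capture(handler) {
  try {
    return new ToyInput(handler(), undefined)
  } catch (err) {
    return new ToyInput(undefined, err)
  }
}

var failed = capture(function () { throw new Error('oops!') })
var ok = capture(function () { return 41 })
```

Calling ok.get() returns 41, while failed.get() throws the original error. Deferring the throw until get() is called is what lets an error travel through the graph and surface only where the value is actually used.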
Note, this is only in beta stages and is expected to be developed significantly over a relatively short period of time. It's quite close to the metal right now in terms of API and we will learn about how best to take advantage of composers over time.
More importantly though, note that the core API may change, which is the most important reason why this is still a beta release.
In the meantime, please keep the feedback coming!
- Add per-node tracing and error reporting hooks
- Add graph output mode
Questions, comments, bug reports, and pull requests are all welcome. Submit them at the project on GitHub.
Bug reports that include steps-to-reproduce (including code) are the best. Even better, make them in the form of pull requests that update the test suite. Thanks!
David Byttow supported by The Obvious Corporation.
Copyright 2012 The Obvious Corporation.
Licensed under the Apache License, Version 2.0.
See the top-level file LICENSE.txt and http://www.apache.org/licenses/LICENSE-2.0.