## Where to accumulate?

- When building a derived collection, the central question is where
- accumulation will happen: within derivation registers, or within a
- materialized database? Both approaches can produce equivalent results,
- but they do it in very different ways.
+ When building a derived collection, the central question is where accumulation
+ will happen: within derivation state, or within an external database that you
+ materialize into? Both approaches can produce equivalent results, but they do it
+ in very different ways.

- ### Accumulate in the Database
+ ### Accumulate in the external database

- To accumulate in the database, you'll define a collection having a reducible
- schema with a derivation that uses only "publish" lambdas and no registers.
- The Flow runtime uses the provided annotations to reduce new documents into
- the collection, and ultimately keep the materialized table up to date.
+ To accumulate in the external database, you'll define a collection having a
+ reducible schema with a stateless derivation. The derivation can be written
+ in either SQL or TypeScript, but for these examples we use TypeScript. The
+ Flow runtime uses the provided annotations to reduce new documents into the
+ collection, and ultimately keep the materialized table up to date.

A key insight is that the database is the _only_ stateful system in this
scenario, and that Flow is making use of reductions in two places:
@@ -45,42 +46,43 @@ When materializing into a pub/sub topic, there _is_ no store to hold final value
and Flow will publish delta states: each a partial update of the (unknown)
final value.
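As a rough illustration of delta states, the sketch below models a stateless derivation that emits one combined delta per transaction, and a stand-in for the database that sums each delta into its stored row. It uses the same inputs as the summing example (2, 1, 2, then -2, 1, then 3, -2, 1). The names `reduceSum`, `Table`, and `applyDelta` are hypothetical and are not part of Flow's API; in Flow the reduction is driven by schema annotations rather than hand-written code, and the one-delta-per-transaction grouping is an assumption made for the sketch.

```typescript
// Hypothetical model, not Flow's API. A "sum" reduction combines two values.
const reduceSum = (lhs: number, rhs: number): number => lhs + rhs;

// Stand-in for a materialized table: one stored value per key.
type Table = Record<string, number>;

// Reduce a delta state into the stored row (assumed to start at 0).
function applyDelta(target: Table, key: string, delta: number): void {
  const current = key in target ? target[key] : 0;
  target[key] = reduceSum(current, delta);
}

// Source documents arriving at T0, T1, and T2 of the summing example.
const transactions: number[][] = [[2, 1, 2], [-2, 1], [3, -2, 1]];

const table: Table = {};
const deltas: number[] = [];
for (const docs of transactions) {
  // The derivation holds no state: it only combines this transaction's documents.
  const delta = docs.reduce(reduceSum, 0);
  deltas.push(delta);
  // The database is the only place where the running total lives.
  applyDelta(table, "sum", delta);
}

console.log(deltas.join(", ")); // 5, -1, 2
console.log(table["sum"]);      // 6
```

A pub/sub topic would receive only the individual deltas (5, -1, 2), since there is no stored row to reduce them into.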

- ### Accumulate in Registers
+ ### Accumulate in derivation state

- Accumulating in registers involves a derivation that defines a reducible
- register schema, and uses "update" lambdas.
- Registers are arbitrary documents that can be shared and updated by the various
- transformations of a derivation. The Flow runtime allocates, manages, and scales
- durable storage for registers; you don't have to.
+ Accumulating in derivation state involves a `sqlite` derivation having one
+ or more tables, which are created by `migrations`. These tables can be shared
+ and updated by the various transforms of the derivation. The Flow runtime
+ transactionally persists modifications to these tables.

- When using registers, the typical pattern is to use reduction annotations
- within updates of the register, and to then publish last-write-wins "snapshots"
- of the fully reduced value.
+ When using a stateful derivation, the typical pattern is to use `INSERT ... ON
+ CONFLICT ...` to accumulate state in your tables, and then `SELECT` from those
+ tables to emit the documents.

Returning to our summing example:

- | Time | Register | Lambdas                             | Derived Document |
- | ---- | -------- | ----------------------------------- | ---------------- |
- | T0   | **0**    | update(2, 1, 2), publish(register)  | **5**            |
- | T1   | **5**    | update(-2, 1), publish(register)    | **4**            |
- | T2   | **4**    | update(3, -2, 1), publish(register) | **6**            |
- | T3   | **6**    | update()                            |                  |
+ | Time | sum table | Lambdas                          | Derived Document |
+ | ---- | --------- | -------------------------------- | ---------------- |
+ | T0   | **0**     | update(2, 1, 2), select sum ...  | **5**            |
+ | T1   | **5**     | update(-2, 1), select sum ...    | **4**            |
+ | T2   | **4**     | update(3, -2, 1), select sum ... | **6**            |
+ | T3   | **6**     | update()                         |                  |
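The table above can be replayed with a small simulation. This is only a sketch: the `SumState` class stands in for a single-row SQLite table managed by the derivation, `update` plays the role of an `INSERT ... ON CONFLICT ...` upsert, and `select` plays the role of the `SELECT`. None of these names come from Flow.

```typescript
// Hypothetical stand-in for a single-row "sum" table in derivation state.
class SumState {
  private total = 0;

  // Accumulate source values, as an upsert would.
  update(...values: number[]): void {
    for (const v of values) {
      this.total += v;
    }
  }

  // Emit a snapshot of the fully reduced value.
  select(): number {
    return this.total;
  }
}

const state = new SumState();
const published: number[] = [];

// Documents arriving at T0, T1, and T2.
for (const docs of [[2, 1, 2], [-2, 1], [3, -2, 1]]) {
  state.update(...docs);
  published.push(state.select());
}

console.log(published.join(", ")); // 5, 4, 6
```

T3 is left out of the loop because no documents arrive there, so nothing is published and the stored total stays at 6.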

- Register derivations are a great solution for materializations into non-
+ Stateful derivations are a great solution for materializations into non-
transactional stores, because the documents they produce can be applied
multiple times without breaking correctness.
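To see why repeated application is safe, compare a snapshot with a delta when the same document is applied twice. In this sketch, `applySnapshot` and `applyDelta` are hypothetical helpers that model a last-write-wins store and a summing store respectively; they are illustrations, not Flow functions.

```typescript
// Last-write-wins: the stored value is simply replaced by the snapshot.
function applySnapshot(_stored: number, snapshot: number): number {
  return snapshot;
}

// Sum reduction: the delta is added to whatever is stored.
function applyDelta(stored: number, delta: number): number {
  return stored + delta;
}

// At T1 of the summing example the total moves from 5 to 4 (a delta of -1).
const before = 5;

// Apply the same document twice, as may happen without transactions.
const snapshotTwice = applySnapshot(applySnapshot(before, 4), 4);
const deltaTwice = applyDelta(applyDelta(before, -1), -1);

console.log(snapshotTwice); // 4 (still correct)
console.log(deltaTwice);    // 3 (the delta was double-counted)
```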

They're also well suited for materializations that publish into pub/sub,
as they can produce stand-alone updates of a fully-reduced value.

- ### Example: Summing in DB vs Register
+ Additionally, stateful derivations are the best way to perform inner joins and time-windowed joins.
+
+ ### Example: Summing in a stateless vs a stateful derivation

See [summer.flow.yaml](summer.flow.yaml) for a simple example
- of summing counts in the database, vs in registers.
+ of summing counts using both approaches.

## Types of Joins

- ### Outer Join accumulated in Database
+ ### Outer Join using a stateless derivation

Example of an outer join, which is reduced within a target database table.
This join is "fully reactive": it updates with either source collection,
@@ -89,11 +91,11 @@ and reflects the complete accumulation of their documents on both sides.
The literal documents written to the collection are combined delta states,
reflecting changes on one or both sides of the join. These delta states
are then fully reduced into the database table, and no other storage _but_
- the table is required by this example.
+ the table being materialized into is required.

See [join-outer-flow.yaml](join-outer-flow.yaml).
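A minimal sketch of this idea, assuming each side of the join is a counter reduced by summing: every delta state carries one or both sides, and a stand-in for the database table merges it into the stored row. The `JoinRow` shape and the `sumOptional`, `reduceRow`, and `materialize` helpers are hypothetical and are not taken from the linked example.

```typescript
// Hypothetical joined row: each side is absent until a document arrives for it.
interface JoinRow {
  lhs?: number | undefined;
  rhs?: number | undefined;
}

// Sum two optional counters, staying "absent" when neither side has a value.
function sumOptional(a?: number, b?: number): number | undefined {
  if (a === undefined) return b;
  if (b === undefined) return a;
  return a + b;
}

// Reduce a delta state into the stored row, field by field.
function reduceRow(stored: JoinRow, delta: JoinRow): JoinRow {
  return {
    lhs: sumOptional(stored.lhs, delta.lhs),
    rhs: sumOptional(stored.rhs, delta.rhs),
  };
}

// Stand-in for the materialized table, keyed on the join key.
const joined: Record<string, JoinRow> = {};

function materialize(key: string, delta: JoinRow): void {
  joined[key] = reduceRow(key in joined ? joined[key] : {}, delta);
}

materialize("a", { lhs: 1 });         // only the left side is known
materialize("a", { rhs: 10 });        // the right side arrives later
materialize("a", { lhs: 2, rhs: 5 }); // one delta may carry both sides
materialize("b", { rhs: 7 });         // unmatched rows are kept (outer join)

console.log(JSON.stringify(joined));
// {"a":{"lhs":3,"rhs":15},"b":{"rhs":7}}
```

The only state in this model is the `joined` table itself, which mirrors the point that no storage other than the materialized table is needed.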

- ### Inner Join accumulated in Registers
+ ### Inner Join using a stateful derivation

Example of an inner join, which is reduced within the derivation's registers.
This join is also "fully reactive", updating with either source collection,
@@ -108,7 +110,7 @@ join are matched.

See [join-inner.flow.yaml](join-inner.flow.yaml).
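The behavior can be sketched as follows, assuming each side is a counter accumulated by summing and that a joined document is emitted only once both sides have been seen for a key. `JoinState`, `stateByKey`, and `onDocument` are hypothetical names standing in for the derivation's state and transforms; they do not come from the linked example.

```typescript
// Hypothetical per-key state held by the derivation: the sides seen so far.
interface JoinState {
  lhs?: number | undefined;
  rhs?: number | undefined;
}

const stateByKey: Record<string, JoinState> = {};

// Record one side of the join. Return a joined document only when both
// sides are present for this key; otherwise return null (emit nothing).
function onDocument(
  key: string,
  side: "lhs" | "rhs",
  value: number
): { key: string; lhs: number; rhs: number } | null {
  const s: JoinState = key in stateByKey ? stateByKey[key] : {};
  s[side] = (s[side] ?? 0) + value;
  stateByKey[key] = s;

  if (s.lhs === undefined || s.rhs === undefined) {
    return null;
  }
  return { key, lhs: s.lhs, rhs: s.rhs };
}

const emitted = [
  onDocument("a", "lhs", 1),  // null: no right side yet
  onDocument("b", "rhs", 7),  // null: no left side yet
  onDocument("a", "rhs", 10), // matched: lhs 1, rhs 10
  onDocument("a", "lhs", 2),  // reactive update: lhs 3, rhs 10
];

console.log(JSON.stringify(emitted));
// [null,null,{"key":"a","lhs":1,"rhs":10},{"key":"a","lhs":3,"rhs":10}]
```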

- ### One-sided Join accumulated in Registers
+ ### One-sided join using a stateful derivation

Example of a one-sided join, which publishes a current LHS joined
with an accumulated RHS.
@@ -118,13 +120,3 @@ paired with a reduced snapshot of the RHS accumulator at that time.

See [join-one-sided.flow.yaml](join-one-sided.yaml).
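A minimal sketch of a one-sided join, assuming the RHS is a counter accumulated by summing and that an LHS with no RHS yet is paired with 0. The names `rhsTotals`, `onRhs`, and `onLhs` are hypothetical and are not taken from the linked example.

```typescript
// Hypothetical state: the accumulated right-hand side, keyed on the join key.
const rhsTotals: Record<string, number> = {};

// RHS documents only update state; they publish nothing.
function onRhs(key: string, value: number): void {
  rhsTotals[key] = (key in rhsTotals ? rhsTotals[key] : 0) + value;
}

// Each LHS document is published once, paired with the RHS snapshot at that time.
function onLhs(
  key: string,
  lhs: string
): { key: string; lhs: string; rhs: number } {
  return { key, lhs, rhs: key in rhsTotals ? rhsTotals[key] : 0 };
}

onRhs("a", 10);
const first = onLhs("a", "click-1");  // paired with rhs = 10
onRhs("a", 5);                        // does not re-publish "click-1"
const second = onLhs("a", "click-2"); // paired with rhs = 15

console.log(JSON.stringify([first, second]));
// [{"key":"a","lhs":"click-1","rhs":10},{"key":"a","lhs":"click-2","rhs":15}]
```

Unlike the fully reactive joins, a later RHS update leaves already-published LHS documents untouched.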

- ### Comparing Registers
-
- Suppose we want to take action based on how a register is changing.
-
- For example, suppose we want to detect "zero crossings" of a running sum,
- and then filter the source collection to those documents which caused the
- sum to cross from positive to negative (or vice versa).
-
- We can use the `previous` register value to do so.
- See [zero-crossing.flow.yaml](zero-crossing.flow.yaml).