Skip to content

Latest commit

 

History

History
70 lines (52 loc) · 3.44 KB

props_11_16_23.md

File metadata and controls

70 lines (52 loc) · 3.44 KB

Joe and Mingwei Discussion 11/16/23

Definition ??: The output of operator delta is integrable if S -> delta -> fold<'static> = S -> fold<'static> .


Let's start with a simple case. fold vs. lattice_fold and 'tick vs. 'static.

Assume input order is fixed, but parenthesization of input is non-deterministic.

  • fold[f] with non-associative f produces a non-deterministic stream.

  • fold<'tick>[f] with associative f produces an integrable stream: fold<'tick>[f] -> fold<'static>[f] = fold<'static>[f]

  • fold<'static>[f] with associative f produces an integrable stream and a deterministic final result.

  • lattice_fold<'tick> is associative by definition, and produces a monotonic, integrable stream.

  • lattice_fold<'static> is associative by definition, and produces a monotonic integrable stream and a deterministic final result.

Note (downgrade fold on monotonic input): given a monotonic input stream, fold[f] with associative f produces the same output stream with static or tick semantics.


Definition: Given map function $m: T_1 \rightarrow T_2$, and fold functions $f:T \rightarrow T$ on $T_1$ and $T_2$, m is a morphism w.r.t. f iff i.e. f(m(x1), m(x2)) = m(f(x1, x2))

Observation (morphisms commute with fold): fold[f] -> map[m] = map[m] -> fold[f] if m a morphism wrt f.


Let's not confuse atomize and diff!

  • diff takes an input stream and produces a corresponding output stream of differences between adjacent pairs in the input.
  • atomize is kind of like flatten, where the types of input and output are the same (e.g. atomize would turn a stream of set<T> into a stream of singleton_set<T>, where flatten would turn a stream of set<T> into a stream of T).

I: [S][S + {x}][S + {x} + {s_1}][{}]... I': [S][{x}][{}][{}]... I' -> atomize: [{s1}][{s2}]...[{sn}][{x}][{x}]... lattice_merge(I) = lattice_merge(diff(I)) = lattice_merge(atomize(diff(I)))


// f associative
S -> diff -> fold<'static>[f] 
  = S -> atomize -> fold<'static>[f]
  = S -> fold<'static>[f]

// Sm monotonic, f associative
Sm -> fold<'static>[f]
  = Sm -> fold<'tick>[f]

Hydroflow, through the lens of DBSP

  • We have streams with ticks (arbitrary, non-overlapping, adjacent windows).
  • We have ops that work on either a static (running prefix of a stream) or per-tick basis.
  • Static ops + differentiation is the basic form of DBSP
    • our 'tick semantics should correspond to window functions
  • Let's assume we don't have the diff operator of DBSP in Hydroflow (for now.)

Back to Hydroflow.

Want to know, given an input stream, if network non-determinism will cause output non-determinism.

  1. Running result non-determinism (unavoidable with arbitrary parenthesization)
  2. Final result non-determinism is what we hope to achieve.
    • And possibly Conor's termination detection as well

Some other stream invariants like monotonicity or exclusivity could hold even in the face of non-determinism.


The state of things and possible changes

  • Cumul vs Delta streams and ops that are 'static vs 'tick need to be thought through
  • Do we need some kind of atomize op?
  • What is up with our delta ops? Compare to DBSP differentiation. What do we need (ignoring incremental maintenance optimizations) and what do our existing ops mean? Are they sound? Sufficient? Should we rename them?
  • Let's work the 7-tuple from the other doc through the ops.