Construct the overall assembly theorem
This brings us roughly to parity with the old MM1 `assemble` tactic, which has been moved to `assembler-old.mm1`; it will probably be deleted once everything is moved over and the new assembler works. This part is largely bookkeeping: we have assembled all the procedures and need to append them together to prove that the full thing assembles, and then we finally prove the `end e. u64` side condition that was deferred during the assembly process, to turn `assemble` into `assembled`. The last stage looks trivial but is actually somewhat expensive in the MM0 model of computation by verification, and it makes an interesting asymptotic analysis problem.

We have the theorems

    theorem assembled_l: $ assembled ctx (A +asm B) $ > $ assembled ctx A $;
    theorem assembled_r: $ assembled ctx (A +asm B) $ > $ assembled ctx B $;

which can be thought of as a fancy version of the rule `(p -> a /\ b) |- (p -> a)`, and we want to use them to decompose a large conjunction `p -> a /\ ... /\ z` with, let's say, `n` conjuncts into `n` theorems asserting each conjunct. The trouble is that MM0 requires each theorem to stand alone in terms of term formation, which means that even if we have been good and the conjunction is a balanced binary tree, so that each individual proof involves only `log n` steps, we still need to construct the term `p -> a /\ ... /\ z`, which is O(n), in each conjunct extraction theorem, leading to O(n^2) work.

We can improve on this by adding lemmas. Suppose that `n = k * m`, and we separate the conjuncts into `k` groups of size `m`. So we prove the lemmas

    p -> a[0] /\ ... /\ a[m-1],
    ...
    p -> a[(k-1)*m] /\ ... /\ a[(k-1)*m + m-1]

and then decompose the pieces further into conjuncts. Ignoring lower order terms, each lemma can be proved in O(k*m) steps (because we have to state the full conjunction in the first step), so the total cost of all the lemmas is O(k^2 m) = O(k n). Adding in the cost of proving the conjuncts results in the following recurrence:

    T(k * m) <= k T(m) + O(k^2 * m)

If we solve the groups with the naive method, so that T(m) <= m^2, we get T(k*m) <= O(k m^2 + k^2 m), and the best choice is to take k = sqrt n, for a bound of O(n^(3/2)). But there is no reason to stop at just one subdivision: if we take a branching factor of `k` and apply this method recursively, we get:

    T(k^(i+1)) <= k T(k^i) + O(k^(i+2))
      => T(k^i) = O(i * k^(i+1))
      => T(n) = O(n * log n * (k / log k))

So we get O(n log n) time if we pick any constant branching factor, with the optimal choice being k = e. Well, that's annoying, but k = 2 (make a lemma at every node) is easy to implement and only 6% worse.

I believe this to be optimal within the current constraints of the language, but with language changes it could be brought down to linear time. I considered this for the MMB format, with a `theorems` command that would allow you to prove multiple theorems in one go, reusing the term pool for all of them. In that case you could prove all n theorems in O(n) steps, each theorem would be constant size, and the term pool used in the proof would also be O(n), so the overall cost would be O(n). In the end it turned out to be a lot of complication to ask of verifiers, and a log n overhead isn't *that* bad, especially considering that the MM0 model of computation by term construction already imposes a log n overhead over random access, because term constructors have a constant branching factor.
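As a sanity check on the k = 2 analysis above, here is a small illustrative cost model. It is not the actual MM1 tactic: `split_cost` is a hypothetical helper, and the assumption that each child lemma costs roughly one restatement of its parent's conjunction is a simplification of the "each theorem stands alone" constraint. Under those assumptions the total work tracks n log n, matching the bound derived above.

```python
# Illustrative cost model only -- not the real MM1 tactic. We assume a lemma
# is emitted at every node of a balanced conjunction tree (branching factor
# k = 2), and that proving each child lemma costs ~n steps to restate the
# parent conjunction, since each theorem must stand alone.
import math

def split_cost(n: int) -> int:
    r"""Work to extract all n conjuncts of `p -> a_1 /\ ... /\ a_n`,
    inserting an intermediate lemma for each half at every level."""
    if n <= 1:
        return 0  # a single conjunct needs no further splitting
    half = n // 2
    # Two child lemmas, each restating the size-n parent conjunction.
    restate = 2 * n
    return restate + split_cost(half) + split_cost(n - half)

if __name__ == "__main__":
    # The ratio cost / (n * lg n) settles at a constant, i.e. O(n log n).
    for n in (1 << 10, 1 << 14, 1 << 18):
        c = split_cost(n)
        print(f"n = {n:7d}  cost = {c:10d}  cost/(n lg n) = {c / (n * math.log2(n)):.2f}")
```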