From 56614d54f639a3e9be0528d7a1ccd924c44ca950 Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Mon, 23 Dec 2024 12:06:50 -0700 Subject: [PATCH 1/7] Wrote tech-report section of sortition of IBs and EBs. --- docs/technical-report-1.md | 108 +++++++++++++++++++++++++++++++++ images/prob-no-eb.svg | 77 ++++++++++++++++++++++++ images/prob-no-ib.svg | 118 +++++++++++++++++++++++++++++++++++++ images/splitting-stake.svg | 111 ++++++++++++++++++++++++++++++++++ 4 files changed, 414 insertions(+) create mode 100644 images/prob-no-eb.svg create mode 100644 images/prob-no-ib.svg create mode 100644 images/splitting-stake.svg diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index 3a3c0342..8da95191 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -6,12 +6,20 @@ ## Introduction +> [!NOTE] +> +> Consider discussing the roadmap and/or methodology of the Leios innovation stream here. + ## Informal description of Short Leios ### Glossary +> [!NOTE] +> +> Should the glossary be moved to the appendix? + - Input block (IB) - Endoser block (EB) - Ranking block (RB) @@ -53,6 +61,106 @@ ![Component diagram of Haskell prototype](../simulation/docs/component.png) +## Sortition + +In Leios stake-based sortition occurs for the selection of the producers of IBs, EBs, and votes. The selection of the producers of IBs and EBs occurs similarly to Praos and the selection of the votes occurs similarly to Mithril. We define two functions for the probability of being elected as a producer or vote in the Bernoulli trials for each unit of stake (i.e., one lottery for each lovelace). Consider the situation where there is a total stake $S$ and a Leios node has $s$ stake delegated to it. Let $f^\prime$ be the probability that a single lovelace wins the lottery and $f$ be the probability that any of the $S$ lovelace win it: + +$$ +f = 1 - (1 - f^\prime)^S +$$ + +The probability that the node with $s$ stake delegated to it wins any of the lotteries is + +$$ +\psi_{f^\prime}(s) = 1 - (1 - f^\prime)^s +$$ +which can be simplified with the relative stake $\sigma = s / S$ to + +$$ +\phi_f(\sigma) = \psi_{f^\prime}(\sigma \cdot S) = 1 - (1 - f)^\sigma +$$ + +### Advantage gained by splitting stake among nodes + +Note that this formulation encourages the splitting of stake among many nodes: by pooling stake at a single node the stake has only one chance to build an RB, IB, or EB, but by splitting the stake among many nodes there is a chance for multiple RBs, IBs, or EBs in the same lottery. We can quantify this effect by comparing the expected number of lottery wins when splitting $\sigma$ into two $\frac{1}{2}\sigma$ pieces. The expected number of lottery wins for split stake minus combined stake is always positive for $\sigma > 0$. + +$$ +\delta = 1 \cdot \phi_f\left(\frac{\sigma}{2}\right) + 1 \cdot \phi_f\left(\frac{\sigma}{2}\right) - 1 \cdot \phi_f(\sigma) = 2 \left[ 1 - (1 - f)^{\sigma/2} \right] - \left[ 1 - (1 - f)^\sigma \right] = \left[ 1 - (1 - f)^{\sigma/2} \right]^2 = \left[ \phi_f\left(\frac{\sigma}{2}\right) \right]^2 > 0 +$$ + +In relative terms, $\frac{\delta}{\phi_f(\sigma)} \approx \frac{\phi_f(\sigma)}{4}$ , so the advantage is small unless the node starts with an appreciable probability of producing a block. A Taylor-series expansion reveals $2\phi_f(\sigma/2) = \phi_g(\sigma)$ with $g = f \cdot (1 + \frac{1}{4}f \cdot \sigma) + \mathcal{O}(f^3)$, implying that the benefit of splitting stake in half is equivalent to the protocol parameter for the lottery being fractionally increased by $f \cdot \sigma / 4$. In the limit of splitting the stake into a large number of parts, we have the limit + +$$ +\lim_{k \rightarrow \infty} k \cdot \phi_f\left( \frac{\sigma}{k} \right) = \phi_h(\sigma) +$$ + +where $h = f \cdot ( 1 + \frac{1}{2} f \cdot \sigma) + \mathcal{O}(f^3)$, which implies a fractional benefit limited to $f \cdot \sigma / 2$. + +> [!NOTE] +> +> - [ ] Do we need to include a proof of $h = f \cdot ( 1 + \frac{1}{2} f \cdot \sigma) + \mathcal{O}(f^3)$, perhaps in an appendix? +> - [ ] Should we instead include the exact result, which is really messy? + +The following plot shows this effect on splitting stake. The horizontal axis represents the production rate $f$, which would be per-slot for RBs or IBs and per-period for EBs. The vertical axis shows the fractional benefit of the splitting, $f / (h - f)$. Even a controller of 45% of the stake would only benefit with less than a 20% advantage in block production, even at a high production rate such as a 75% chance per period, if they split their stake minutely. This small advantage would be outweighed by the cost of the computing hardware and bandwidth required to deploy the large number of node having very little stake delegated to them. + +![Effective boost in production from splitting stake](../images/splitting-stake.svg) + + +### Input blocks + +In Leios, even if a node wins the IB lottery several times, it is only allowed to build a single block. This is identical to the sortition rule in Praos. Let $f_\text{IB}$ be the protocol parameter specifying the per-slot probability of a node producing an IB: + +$$ +p_\text{IB} = \phi_{f_\text{IB}}(\sigma) = 1 - (1 - f_\text{IB})^\sigma +$$ + +Given the IB-production phase of length $L$ in each Leios pipeline, we want to avoid the situation where an unlucky lottery results in no IBs being produced during that phase. The probability of none of the $i$ nodes with stake $\sigma_i$ winning the lottery in a slot is + +$$ +\prod_i \left( 1 - \phi_{f_\text{IB}}(\sigma_i) \right) = \prod_i (1 - f_\text{IB})^{\sigma_i} = (1 - f_\text{IB})^{\sum_i \sigma_i} = 1 - f_\text{IB} +$$ + +where $\sum_i \sigma_i = 1$ because the individual stakes sum to the total stake, $S \equiv \sum_i s_i$. Hence the probability of no nodes winning the lottery in any slot of the phase is + +$$ +q_\text{IB} = (1 - f_\text{IB})^L +$$ + +and the expected number of IBs in the phase is + +$$ +n_\text{IB} = f_\text{IB} \cdot L +$$ + +The figure below illustrates the relationship between $f_\text{IB}$ and $q_\text{IB}$. Understandably, short phases can result in appreciable probabilities of no having an input block in the pipeline. + +![Probability of no IB in a phase](../images/prob-no-ib.svg) + + +### Endorser blocks + +The sortition for EBs occurs per phase, not per slot. The previous section's analysis for input blocks holds with minor modification: The probability that a node with stake $\sigma$ wins the EB lottery for a pipeline is + +$$ +p_\text{EB} = \phi_{f_\text{EB}}(\sigma) = 1 - (1 - f_\text{EB})^\sigma +$$ + +and the probability that no EB is produced by any node is + +$$ +q_\text{E} = 1 - f_\text{EB} +$$ + +In Short Leios it is critically important that at least one EB be produced in the pipeline because, otherwise, the pipeline's IBs will not be referenced in the RB and the work done creating them will be lost and their transactions will have to wait for another IB. + +![Probability of no EB in phase](../images/prob-no-eb.svg) + +Short Leios also relies on the EB being included in an RB before another EB is produced: endorser blocks are not allowed to queue awaiting RBs or to reference other EBs. Thus, the EB rate should be consistent with the RB rate. + + +### Votes + + ## Voting and certificates Leios voting involves numerous decisions about cryptographic matters and protocol parameters. diff --git a/images/prob-no-eb.svg b/images/prob-no-eb.svg new file mode 100644 index 00000000..b6d6d021 --- /dev/null +++ b/images/prob-no-eb.svg @@ -0,0 +1,77 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +0 +25 +50 +75 +100 + + + + + + + + + + +0 +25 +50 +75 +100 +Per-phase probability of an EB [%] +Per-phase probability of no EB [%] + + diff --git a/images/prob-no-ib.svg b/images/prob-no-ib.svg new file mode 100644 index 00000000..0cd27c3c --- /dev/null +++ b/images/prob-no-ib.svg @@ -0,0 +1,118 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +0 +25 +50 +75 +100 + + + + + + + + + + +0 +25 +50 +75 +100 +Per-slot probability of an IB [%] +Per-phase probability of no IB [%] + +Phase length [slot] + + + + + + + + + + + + + + + + + + + + +5 +10 +15 +20 +25 +30 +35 +40 +45 +50 + + diff --git a/images/splitting-stake.svg b/images/splitting-stake.svg new file mode 100644 index 00000000..fe5086ba --- /dev/null +++ b/images/splitting-stake.svg @@ -0,0 +1,111 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +0 +5 +10 +15 + + + + + + + + + +0 +25 +50 +75 +100 +Production rate [%/period] +Effective boost in production rate from finely splitting stake [%] + +Share of stake [%] + + + + + + + + + + + + + + + + + + +5 +10 +15 +20 +25 +30 +35 +40 +45 + + From cbc1c224f2282c33ad72b8145e12072cbe10abdb Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Mon, 23 Dec 2024 15:09:07 -0700 Subject: [PATCH 2/7] Exact result for effective boost. --- docs/technical-report-1.md | 34 +++++++++++++++++++++++++++++++--- images/splitting-stake.svg | 18 +++++++++--------- 2 files changed, 40 insertions(+), 12 deletions(-) diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index 8da95191..4d3528e0 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -94,12 +94,12 @@ $$ \lim_{k \rightarrow \infty} k \cdot \phi_f\left( \frac{\sigma}{k} \right) = \phi_h(\sigma) $$ -where $h = f \cdot ( 1 + \frac{1}{2} f \cdot \sigma) + \mathcal{O}(f^3)$, which implies a fractional benefit limited to $f \cdot \sigma / 2$. +where $h = 1 - \sqrt[\sigma]{1 + \sigma \cdot \log (1 - f)} = f \cdot ( 1 + \frac{1}{2} f \cdot \sigma) + \mathcal{O}(f^3)$, which implies a fractional benefit limited to $f \cdot \sigma / 2$. > [!NOTE] > -> - [ ] Do we need to include a proof of $h = f \cdot ( 1 + \frac{1}{2} f \cdot \sigma) + \mathcal{O}(f^3)$, perhaps in an appendix? -> - [ ] Should we instead include the exact result, which is really messy? +> - [ ] Do we need to include the derivation of $h = 1 - \sqrt[\sigma]{1 + \sigma \cdot \log (1 - f)} = f \cdot ( 1 + \frac{1}{2} f \cdot \sigma) + \mathcal{O}(f^3)$? +> - [ ] . . . perhaps in an appendix? The following plot shows this effect on splitting stake. The horizontal axis represents the production rate $f$, which would be per-slot for RBs or IBs and per-period for EBs. The vertical axis shows the fractional benefit of the splitting, $f / (h - f)$. Even a controller of 45% of the stake would only benefit with less than a 20% advantage in block production, even at a high production rate such as a 75% chance per period, if they split their stake minutely. This small advantage would be outweighed by the cost of the computing hardware and bandwidth required to deploy the large number of node having very little stake delegated to them. @@ -161,6 +161,34 @@ Short Leios also relies on the EB being included in an RB before another EB is p ### Votes +#### Leader-election like voting[​](https://peras.cardano-scaling.org/docs/reports/tech-report-2#leader-election-like-voting "Direct link to Leader-election like voting") + +The first algorithm is basically identical to the one used for [Mithril](https://mithril.network/) signatures, and is also the one envisioned for [Leios](https://leios.cardano-scaling.org/) (see Appendix D of the recent Leios paper). It is based on the following principles: + +- The goal of the algorithm is to produce a number of votes targeting a certain threshold such that each voter receives a number of vote proportionate to σσ, their fraction of total stake, according to the basic probability function ϕ(σ)=1−(1−f)σϕ(σ)=1−(1−f)σ, +- There are various parameters to the algorithm: + - ff is the fraction of slots that are "active" for voting + - mm is the number of _lotteries_ each voter should try to get a vote for, + - kk is the target total number of votes for each round (e.g., quorum) kk should be chosen such that k=m⋅ϕ(0.5)k=m⋅ϕ(0.5) to reach a majority quorum, +- When its turn to vote comes, each node run iterates over an index i∈[1…m]i∈[1…m], computes a hash from the _nonce_ and the index ii, and compares this hash with f(σ)f(σ): if it is lower than or equal, then the node has one vote. + - Note the computation f(σ)f(σ) is exactly identical to the one used for [leader election](https://github.com/intersectmbo/cardano-ledger/blob/f0d71456e5df5a05a29dc7c0ac9dd3d61819edc8/libs/cardano-protocol-tpraos/src/Cardano/Protocol/TPraos/BHeader.hs#L434). + +We [prototyped](https://github.com/input-output-hk/peras-design/blob/73eabecd272c703f1e1ed0be7eeb437d937e1179/peras-vote/src/Peras/Voting/Vote.hs#L311) this approach in Haskell. + +#### Sortition-like voting[​](https://peras.cardano-scaling.org/docs/reports/tech-report-2#sortition-like-voting "Direct link to Sortition-like voting") + +The second algorithm is based on the _sortition_ process initially invented by [Algorand](https://web.archive.org/web/20170728124435id_/https://people.csail.mit.edu/nickolai/papers/gilad-algorand-eprint.pdf) and [implemented](https://github.com/algorand/sortition/blob/main/sortition.cpp) in their node. It is based on the same idea, namely that a node should have a number of votes proportional to their fraction of total stake, given a target "committee size" expressed as a fraction of total stake pp. And it uses the fact the number of votes a single node should get based on these parameters follows a binomial distribution. + +The process for voting is thus: + +- Compute the individual probability of each "coin" to win a single vote pp as the ratio of expected committee size over total stake. +- Compute the binomial distribution B(n,p)B(n,p) where nn is the node's stake. +- Compute a random number between 0 and 1 using _nonce_ as the denominator over maximum possible value (e.g., all bits set to 1) for the nonce as denominator. +- Use [bisection method](https://en.wikipedia.org/wiki/Bisection_method) to find the value corresponding to this probability in the CDF for the aforementioned distribution. + +This yields a vote with some _weight_ attached to it "randomly" computed so that the overall sum of weights should be around expected committee size. + + ## Voting and certificates Leios voting involves numerous decisions about cryptographic matters and protocol parameters. diff --git a/images/splitting-stake.svg b/images/splitting-stake.svg index fe5086ba..73563280 100644 --- a/images/splitting-stake.svg +++ b/images/splitting-stake.svg @@ -47,15 +47,15 @@ - - - - - - - - - + + + + + + + + + 0 From ded2f73e3983169b7a799a3ab8a1bc22089a8f1c Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Tue, 24 Dec 2024 05:51:28 -0700 Subject: [PATCH 3/7] Added text on VRF and selection. --- docs/technical-report-1.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index 4d3528e0..ccb06245 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -80,6 +80,9 @@ $$ \phi_f(\sigma) = \psi_{f^\prime}(\sigma \cdot S) = 1 - (1 - f)^\sigma $$ +Sortition reuses the Praos VRF, but prepending a string such as `Leios-IB`, `Leios-EB`, or `Leios-Vote` to the bytes being hashed, in order to remove correlations between the uses of the VRF. The resulting VRF value `v` is compared to the probability (in the case of IBs and EBs) or the cumulative probability (in the case of votes) for winning the sortition lottery. This is equivalent to the eligibility checks in the Leios paper. + + ### Advantage gained by splitting stake among nodes Note that this formulation encourages the splitting of stake among many nodes: by pooling stake at a single node the stake has only one chance to build an RB, IB, or EB, but by splitting the stake among many nodes there is a chance for multiple RBs, IBs, or EBs in the same lottery. We can quantify this effect by comparing the expected number of lottery wins when splitting $\sigma$ into two $\frac{1}{2}\sigma$ pieces. The expected number of lottery wins for split stake minus combined stake is always positive for $\sigma > 0$. @@ -114,6 +117,8 @@ $$ p_\text{IB} = \phi_{f_\text{IB}}(\sigma) = 1 - (1 - f_\text{IB})^\sigma $$ +If $v_\text{IB} \in [0,1]$ is the node's IB VRF value for the current slot, then the node's eligibility condition is $v_\text{IB} \leq p_\text{IB}$. + Given the IB-production phase of length $L$ in each Leios pipeline, we want to avoid the situation where an unlucky lottery results in no IBs being produced during that phase. The probability of none of the $i$ nodes with stake $\sigma_i$ winning the lottery in a slot is $$ @@ -151,6 +156,8 @@ $$ q_\text{E} = 1 - f_\text{EB} $$ +If $v_\text{EB} \in [0,1]$ is the node's EB VRF value for the current phase, then the node's eligibility condition is $v_\text{EB} \leq p_\text{EB}$. + In Short Leios it is critically important that at least one EB be produced in the pipeline because, otherwise, the pipeline's IBs will not be referenced in the RB and the work done creating them will be lost and their transactions will have to wait for another IB. ![Probability of no EB in phase](../images/prob-no-eb.svg) From 4437ed8f211de2ab8646749b87acea6b195ab654 Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Wed, 25 Dec 2024 05:37:52 -0700 Subject: [PATCH 4/7] Text and plots for honest/adversarial EB production. --- docs/technical-report-1.md | 21 ++- images/prob-no-eb-1000.svg | 297 +++++++++++++++++++++++++++++++++++ images/prob-no-eb-2500.svg | 297 +++++++++++++++++++++++++++++++++++ images/prob-no-eb.svg | 312 +++++++++++++++++++++++++++++++------ 4 files changed, 874 insertions(+), 53 deletions(-) create mode 100644 images/prob-no-eb-1000.svg create mode 100644 images/prob-no-eb-2500.svg diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index ccb06245..7dde7fb8 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -91,7 +91,7 @@ $$ \delta = 1 \cdot \phi_f\left(\frac{\sigma}{2}\right) + 1 \cdot \phi_f\left(\frac{\sigma}{2}\right) - 1 \cdot \phi_f(\sigma) = 2 \left[ 1 - (1 - f)^{\sigma/2} \right] - \left[ 1 - (1 - f)^\sigma \right] = \left[ 1 - (1 - f)^{\sigma/2} \right]^2 = \left[ \phi_f\left(\frac{\sigma}{2}\right) \right]^2 > 0 $$ -In relative terms, $\frac{\delta}{\phi_f(\sigma)} \approx \frac{\phi_f(\sigma)}{4}$ , so the advantage is small unless the node starts with an appreciable probability of producing a block. A Taylor-series expansion reveals $2\phi_f(\sigma/2) = \phi_g(\sigma)$ with $g = f \cdot (1 + \frac{1}{4}f \cdot \sigma) + \mathcal{O}(f^3)$, implying that the benefit of splitting stake in half is equivalent to the protocol parameter for the lottery being fractionally increased by $f \cdot \sigma / 4$. In the limit of splitting the stake into a large number of parts, we have the limit +In relative terms, $\frac{\delta}{\phi_f(\sigma)} \approx \frac{\phi_f(\sigma)}{4}$ , so the advantage is small unless the node starts with an appreciable probability of producing a block. A Taylor-series expansion reveals $2\phi_f(\sigma/2) = \phi_g(\sigma)$ with $g = f \cdot (1 + \frac{1}{4}f \cdot \sigma) + \mathcal{O}(f^3)$, implying that the benefit of splitting stake in half is equivalent to the protocol parameter for the lottery being fractionally increased by $f \cdot \sigma / 4$. This renormalization can be extended to the limit of splitting the stake into a large number of parts, where we have a simple result, $$ \lim_{k \rightarrow \infty} k \cdot \phi_f\left( \frac{\sigma}{k} \right) = \phi_h(\sigma) @@ -150,18 +150,25 @@ $$ p_\text{EB} = \phi_{f_\text{EB}}(\sigma) = 1 - (1 - f_\text{EB})^\sigma $$ -and the probability that no EB is produced by any node is +If $v_\text{EB} \in [0,1]$ is the node's EB VRF value for the current phase, then the node's eligibility condition is $v_\text{EB} \leq p_\text{EB}$. -$$ -q_\text{E} = 1 - f_\text{EB} -$$ +In Short Leios it is critically important that at least one EB be produced in the pipeline because, otherwise, the pipeline's IBs will not be referenced in the RB and the work done creating them will be lost and their transactions will have to wait for another IB. There are four situations of interest for EB production in a particular pipeline: -If $v_\text{EB} \in [0,1]$ is the node's EB VRF value for the current phase, then the node's eligibility condition is $v_\text{EB} \leq p_\text{EB}$. +1. No EB is produced: $(1 - p_\text{hon}) \cdot (1 - p_\text{adv}) = 1 - f_\text{EB}$. +2. Only honest parties produce EBs: $p_\text{hon} \cdot (1 - p_\text{adv})$. +3. Only adversarial parties produce EBs: $(1 - p_\text{hon}) \cdot p_\text{adv}$. +4. Both honest and adversarial parties produce EBs: $p_\text{hon} \cdot p_\text{adv}$. -In Short Leios it is critically important that at least one EB be produced in the pipeline because, otherwise, the pipeline's IBs will not be referenced in the RB and the work done creating them will be lost and their transactions will have to wait for another IB. +where $p_\text{hon} = \phi_{f_\text{EB}}(\sigma_\text{hon})$, $p_\text{adv} = \phi_{f_\text{EB}}(\sigma_\text{adv})$, and $\sigma_\text{hon} + \sigma_\text{adv} = 1$. For this analysis we ignore the splitting of stake because the analysis of it in a previous section indicated that it is inconsequential even in the worst case. ![Probability of no EB in phase](../images/prob-no-eb.svg) +If we account for stake being divided among many nodes, we get a nearly identical result. The left plot below shows the scenario where stake is divided evenly among 1000 nodes and the right plot shows when it is divided among 2500 nodes according to a non-uniform stake distribution similar to that of epoch 500. + +| Uniform stake among 1000 nodes | Realistic non-uniform stake among 2500 nodes | +| ------------------------------------------------------------------ | ---------------------------------------------------------------------- | +| ![EB production with uniform stake](../images/prob-no-eb-1000.svg) | ![EB production with non-uniform stake](../images/prob-no-eb-2500.svg) | + Short Leios also relies on the EB being included in an RB before another EB is produced: endorser blocks are not allowed to queue awaiting RBs or to reference other EBs. Thus, the EB rate should be consistent with the RB rate. diff --git a/images/prob-no-eb-1000.svg b/images/prob-no-eb-1000.svg new file mode 100644 index 00000000..64c3bc82 --- /dev/null +++ b/images/prob-no-eb-1000.svg @@ -0,0 +1,297 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Adversarial EBs only + + + + + + + + + + +Honest and adversarial EBs + + + + + + + + + + +No EB + + + + + + + + + + +Honest EBs only + + + + + + + +0 +25 +50 +75 +100 + + + + + +0 +25 +50 +75 +100 +0 +25 +50 +75 +100 + + + + + +0 +25 +50 +75 +100 + + + + + +Production rate [%/phase] +Probability [%] + +Adversarial +stake [%] + + + + + + + + + + + + + + + + + + + + +0 +5 +10 +15 +20 +25 +30 +35 +40 +45 + + diff --git a/images/prob-no-eb-2500.svg b/images/prob-no-eb-2500.svg new file mode 100644 index 00000000..59bc597a --- /dev/null +++ b/images/prob-no-eb-2500.svg @@ -0,0 +1,297 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Adversarial EBs only + + + + + + + + + + +Honest and adversarial EBs + + + + + + + + + + +No EB + + + + + + + + + + +Honest EBs only + + + + + + + +0 +25 +50 +75 +100 + + + + + +0 +25 +50 +75 +100 +0 +25 +50 +75 +100 + + + + + +0 +25 +50 +75 +100 + + + + + +Production rate [%/phase] +Probability [%] + +Adversarial +stake [%] + + + + + + + + + + + + + + + + + + + + +0 +5 +10 +15 +20 +25 +30 +35 +40 +45 + + diff --git a/images/prob-no-eb.svg b/images/prob-no-eb.svg index b6d6d021..0103e73d 100644 --- a/images/prob-no-eb.svg +++ b/images/prob-no-eb.svg @@ -21,57 +21,277 @@ - + - - + + - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -0 -25 -50 -75 -100 - - - - - - - - - - -0 -25 -50 -75 -100 -Per-phase probability of an EB [%] -Per-phase probability of no EB [%] + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Adversarial EBs only + + + + + + + + + + +Honest and adversarial EBs + + + + + + + + + + +No EB + + + + + + + + + + +Honest EBs only + + + + + + + +0 +25 +50 +75 +100 + + + + + +0 +25 +50 +75 +100 +0 +25 +50 +75 +100 + + + + + +0 +25 +50 +75 +100 + + + + + +Production rate [%/phase] +Probability [%] + +Adversarial +stake [%] + + + + + + + + + + + + + + + + + + + + +0 +5 +10 +15 +20 +25 +30 +35 +40 +45 From b9e69682c9e5921c7407e786a540941ee0534eec Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Wed, 25 Dec 2024 08:25:13 -0700 Subject: [PATCH 5/7] Added analysis of RB prior to next EB. --- docs/technical-report-1.md | 39 +++++++- images/leios-eb-rb.svg | 178 +++++++++++++++++++++++++++++++++++++ 2 files changed, 215 insertions(+), 2 deletions(-) create mode 100644 images/leios-eb-rb.svg diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index 7dde7fb8..867738c5 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -143,7 +143,7 @@ The figure below illustrates the relationship between $f_\text{IB}$ and $q_\text ### Endorser blocks - + The sortition for EBs occurs per phase, not per slot. The previous section's analysis for input blocks holds with minor modification: The probability that a node with stake $\sigma$ wins the EB lottery for a pipeline is $$ @@ -152,6 +152,10 @@ $$ If $v_\text{EB} \in [0,1]$ is the node's EB VRF value for the current phase, then the node's eligibility condition is $v_\text{EB} \leq p_\text{EB}$. +> [!IMPORTANT] +> +> This formulation makes it likely (probability = 1 - f) that some pipelines will have no EBs. Should there be two or more lotteries instead of just one? Or do we just need to accept this as part of Short Leios? + In Short Leios it is critically important that at least one EB be produced in the pipeline because, otherwise, the pipeline's IBs will not be referenced in the RB and the work done creating them will be lost and their transactions will have to wait for another IB. There are four situations of interest for EB production in a particular pipeline: 1. No EB is produced: $(1 - p_\text{hon}) \cdot (1 - p_\text{adv}) = 1 - f_\text{EB}$. @@ -169,7 +173,38 @@ If we account for stake being divided among many nodes, we get a nearly identica | ------------------------------------------------------------------ | ---------------------------------------------------------------------- | | ![EB production with uniform stake](../images/prob-no-eb-1000.svg) | ![EB production with non-uniform stake](../images/prob-no-eb-2500.svg) | -Short Leios also relies on the EB being included in an RB before another EB is produced: endorser blocks are not allowed to queue awaiting RBs or to reference other EBs. Thus, the EB rate should be consistent with the RB rate. +> [!CAUTION] +> +> - [ ] Is there a theoretical argument (like the law of large numbers) why splitting the stake doesn't make a visible difference? +> - [ ] Is it pointless to include such repetitive plots? + +Short Leios also relies on the EB being included in an RB before another EB is produced: endorser blocks are not allowed to queue awaiting RBs or to reference other EBs. Thus, the EB rate should be consistent with the RB rate. We can approximate the probability of there not being an RB in time to include a new EB by answering the following question: given that an EB has been produced, what is the probability that the *next* EB will be produced before the *first* RB? In such a situation, the freshest-first rule would result in the first EB being discarded because the second EB is included in the new RB. The distribution of waiting times for the next EB and next RB are + +$$ +\mathcal{P}(\text{next EB at phase } n) = (1 - f_\text{EB})^{n-1} f_\text{EB} +$$ + +and + +$$ +\mathcal{P}(\text{next RB at slot } s) = (1 - f_\text{RB})^{s-1} f_\text{RB} +$$ + +Hence the probability for the next EB not coming before the next RB is + +$$ +p_\text{included} = 1 - p_\text{discarded} = 1 - \sum_{1 \le s \lt n L \lt \infty} (1 - f_\text{EB})^{n-1} f_\text{EB} (1 - f_\text{RB})^{s-1} f_\text{RB} + = \frac{1 - [f_\text{RB} (f_\text{EB} - 1) + 1] (1 - f_\text{RB})^{L-1}}{1 - (1 - f_\text{EB}) (1 - f_\text{RB})^L} +$$ + +The plots below indicate that the phase length $L$ should be several times ${f_\text{RB}}^{-1}$ in order for there to be a high probability that an RB will appear before the next EB. + +![Probability of the next RB prior to the next EB](../images/leios-eb-rb.svg) + +> [!IMPORTANT] +> +> The above argument needs reworking because it doesn't account for various effects like the EB being per-pipeline, propagation delays, and the RB being per-slot or that there are multiple pipelines. There also may be ambiguities in the specification for the case when several EBs are waiting for an RB: presumably, the "freshest first" rule would be applied here, so the newest EB would go into the RB. We might need simulation for this analysis. + ### Votes diff --git a/images/leios-eb-rb.svg b/images/leios-eb-rb.svg new file mode 100644 index 00000000..a6f74343 --- /dev/null +++ b/images/leios-eb-rb.svg @@ -0,0 +1,178 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +RB rate = 0.05 per slot + + + + + + + + + + +RB rate = 0.1 per slot + + + + + + +0 +25 +50 +75 + + + + +0 +25 +50 +75 +0 +25 +50 +75 +100 + + + + + +L [slot] +Probability of next RB prior to next EB [%] + +EB rate +per phase + + + + + + + + + + + + + + + + + + +0.1 +0.2 +0.3 +0.4 +0.5 +0.6 +0.7 +0.8 +0.9 + + From cd9b676d2ab77b8f6bfa19be03c7cb9dc9e8ffa2 Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Thu, 26 Dec 2024 07:36:30 -0700 Subject: [PATCH 6/7] Wrote section on vote sortition. --- docs/technical-report-1.md | 41 ++++------- images/prob-leios-vote.svg | 142 +++++++++++++++++++++++++++++++++++++ 2 files changed, 157 insertions(+), 26 deletions(-) create mode 100644 images/prob-leios-vote.svg diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index 867738c5..c326527b 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -66,10 +66,10 @@ In Leios stake-based sortition occurs for the selection of the producers of IBs, EBs, and votes. The selection of the producers of IBs and EBs occurs similarly to Praos and the selection of the votes occurs similarly to Mithril. We define two functions for the probability of being elected as a producer or vote in the Bernoulli trials for each unit of stake (i.e., one lottery for each lovelace). Consider the situation where there is a total stake $S$ and a Leios node has $s$ stake delegated to it. Let $f^\prime$ be the probability that a single lovelace wins the lottery and $f$ be the probability that any of the $S$ lovelace win it: $$ -f = 1 - (1 - f^\prime)^S +f = 1 - (1 - f^\prime)^S \approx S \cdot f^\prime $$ -The probability that the node with $s$ stake delegated to it wins any of the lotteries is +where the approximation holds for the very large total stake $S$, which is greater than $10^{16}$ on the Cardano mainnet. The probability that the node with $s$ stake delegated to it wins any of the lotteries is $$ \psi_{f^\prime}(s) = 1 - (1 - f^\prime)^s @@ -206,36 +206,25 @@ The plots below indicate that the phase length $L$ should be several times ${f_\ > The above argument needs reworking because it doesn't account for various effects like the EB being per-pipeline, propagation delays, and the RB being per-slot or that there are multiple pipelines. There also may be ambiguities in the specification for the case when several EBs are waiting for an RB: presumably, the "freshest first" rule would be applied here, so the newest EB would go into the RB. We might need simulation for this analysis. - ### Votes +The sortition for voting differs from that for blocks because, in principle, a node may win several votes in the lottery. -#### Leader-election like voting[​](https://peras.cardano-scaling.org/docs/reports/tech-report-2#leader-election-like-voting "Direct link to Leader-election like voting") - -The first algorithm is basically identical to the one used for [Mithril](https://mithril.network/) signatures, and is also the one envisioned for [Leios](https://leios.cardano-scaling.org/) (see Appendix D of the recent Leios paper). It is based on the following principles: - -- The goal of the algorithm is to produce a number of votes targeting a certain threshold such that each voter receives a number of vote proportionate to σσ, their fraction of total stake, according to the basic probability function ϕ(σ)=1−(1−f)σϕ(σ)=1−(1−f)σ, -- There are various parameters to the algorithm: - - ff is the fraction of slots that are "active" for voting - - mm is the number of _lotteries_ each voter should try to get a vote for, - - kk is the target total number of votes for each round (e.g., quorum) kk should be chosen such that k=m⋅ϕ(0.5)k=m⋅ϕ(0.5) to reach a majority quorum, -- When its turn to vote comes, each node run iterates over an index i∈[1…m]i∈[1…m], computes a hash from the _nonce_ and the index ii, and compares this hash with f(σ)f(σ): if it is lower than or equal, then the node has one vote. - - Note the computation f(σ)f(σ) is exactly identical to the one used for [leader election](https://github.com/intersectmbo/cardano-ledger/blob/f0d71456e5df5a05a29dc7c0ac9dd3d61819edc8/libs/cardano-protocol-tpraos/src/Cardano/Protocol/TPraos/BHeader.hs#L434). - -We [prototyped](https://github.com/input-output-hk/peras-design/blob/73eabecd272c703f1e1ed0be7eeb437d937e1179/peras-vote/src/Peras/Voting/Vote.hs#L311) this approach in Haskell. - -#### Sortition-like voting[​](https://peras.cardano-scaling.org/docs/reports/tech-report-2#sortition-like-voting "Direct link to Sortition-like voting") - -The second algorithm is based on the _sortition_ process initially invented by [Algorand](https://web.archive.org/web/20170728124435id_/https://people.csail.mit.edu/nickolai/papers/gilad-algorand-eprint.pdf) and [implemented](https://github.com/algorand/sortition/blob/main/sortition.cpp) in their node. It is based on the same idea, namely that a node should have a number of votes proportional to their fraction of total stake, given a target "committee size" expressed as a fraction of total stake pp. And it uses the fact the number of votes a single node should get based on these parameters follows a binomial distribution. +$$ +\mathcal{P}(n \text{ votes given stake } s) = {s \choose n} {f_\text{vote}^\prime}^n (1 - f_\text{vote}^\prime)^{s-n} \approx \frac{(f_\text{vote} \cdot \sigma)^n \cdot e^{- f_\text{vote} \cdot \sigma}}{n!} +$$ -The process for voting is thus: +where $f_\text{vote} = f_\text{vote}^\prime \cdot S$ is the mean number of votes in the lottery and the stake is very large relative to the number of votes $n \ll s$. Note that even in this approximation the probabilities sum to one. The formula above and this approximation may be sufficient for use with a VRF to determine the number of votes that a node is entitled to. The table below show the VRF conditions for determining the number of votes in this approximation, which can be expressed as a Taylor series involving operations only on rational numbers. The Cardano mainnet decentralization parameter is current 500, so if we consider the case where the mean number of votes is 500 and a node has 1/500th of the stake, then $f_\text{vote} \cdot \sigma =1$ in this computations, corresponding to the right edge in the plot below: any pool with more stake would be oversaturated. Even in the worst-case scenario of a fully saturated pool, the probability of more than three votes is about 1.90%. -- Compute the individual probability of each "coin" to win a single vote pp as the ratio of expected committee size over total stake. -- Compute the binomial distribution B(n,p)B(n,p) where nn is the node's stake. -- Compute a random number between 0 and 1 using _nonce_ as the denominator over maximum possible value (e.g., all bits set to 1) for the nonce as denominator. -- Use [bisection method](https://en.wikipedia.org/wiki/Bisection_method) to find the value corresponding to this probability in the CDF for the aforementioned distribution. +| Number of votes | Condition of VRF value $v$ | +| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| 0 | $0 \le v \cdot \exp(f_\text{vote} \cdot\sigma) \le 1$ | +| 1 | $1 \lt v \cdot \exp(f_\text{vote} \cdot\sigma) \le 1 + \sigma \cdot f_\text{vote}$ | +| 2 | $1 + \sigma \cdot f_\text{vote} \lt v \cdot \exp(f_\text{vote} \cdot\sigma) \le 1 + \sigma \cdot f_\text{vote} + \frac{1}{2} (\sigma \cdot f_\text{vote})^2$ | +| 3 | $1 + \sigma \cdot f_\text{vote} + \frac{1}{2} (\sigma \cdot f_\text{vote})^2 \lt v \cdot \exp(f_\text{vote} \cdot\sigma) \le 1 + \sigma \cdot f_\text{vote} + \frac{1}{2} (\sigma \cdot f_\text{vote})^2 + \frac{1}{6} (\sigma \cdot f_\text{vote})^3$ | +| $n$ | $\sum_{m=0}^{n-1} \frac{1}{m!} (\sigma \cdot f_\text{vote})^m \lt v \cdot \exp(f_\text{vote} \cdot\sigma) \le \sum_{m=0}^n \frac{1}{m!} (\sigma \cdot f_\text{vote})^m$ | -This yields a vote with some _weight_ attached to it "randomly" computed so that the overall sum of weights should be around expected committee size. +![Probability of number of votes given a stake fraction](../images/prob-leios-vote.svg) ## Voting and certificates diff --git a/images/prob-leios-vote.svg b/images/prob-leios-vote.svg new file mode 100644 index 00000000..dc9c7be3 --- /dev/null +++ b/images/prob-leios-vote.svg @@ -0,0 +1,142 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Mean votes f = 500 + + + + + + +3e-05 +1e-04 +3e-04 +1e-03 +1e-11 +1e-07 +1e-03 + + + +Fraction of stake [%/100] +Probability of the given number of votes [%/100] + +Number +of votes + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +1 +2 +3 +4 +5 +6 +7 +8 +9 +10 +11 +12 +13 +14 +15 + + From 4de256fd9005ce9311fb8e47587a5d804b25d5a2 Mon Sep 17 00:00:00 2001 From: Brian W Bush Date: Thu, 26 Dec 2024 08:54:35 -0700 Subject: [PATCH 7/7] Added insights about sortition to tech report and log book. --- Logbook.md | 20 ++++++++++++++++++++ docs/technical-report-1.md | 19 +++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/Logbook.md b/Logbook.md index 483ad649..68f8c450 100644 --- a/Logbook.md +++ b/Logbook.md @@ -1,5 +1,25 @@ # Leios logbook +## 2024-12-26 + +### Sortition + +Analyses of sortition for IBs, EBs, and votes were made and added to the draft of the first technical report. + +- All of the sortition is based on Bernoulli trials for each stake of lovelace. +- A probabilistic analysis shows that splitting stake among many nodes only provides a minor benefit in winning the lottery more times. +- The IB lottery is per-slot and limits a node to building a maximum of one IB per slot. + - Careful selection of protocol parameters ensures a high probability of at least one IB in each pipeline. +- The EB lottery is per-pipeline and limits a node to building a maximum of one EB per pipeline. + - The formulation of the EB lottery implies that there will always be a significant chance that a pipeline contains no EB. + - Setting protocol parameters so that there is a high probability of an EB in a pipeline makes the protocol more susceptible to influence by adversaries with significant stake. + - However, the unevenness in stake distribution or splitting of adversarial stake does not exacerbate the situation. +- The pipeline length should be several multiples of the inverse of the active-slot coefficient, in order that there is a high probability for an RB to be available for an EB certificate. +- The vote lottery can award multiple votes to the same node if they have a lot of stake and are luck. + - Nodes nearly saturated with stake have an appreciable chance of receiving several votes. + - The mathematics and code for determining from the VRF the number of votes a node winds is more complex and involves complications to avoid floating-point computations. + - Limiting nodes to a maximum of one vote would likely be safe if the mean number of votes is no larger than the effective decentralizations (i.e., the number of nodes with appreciable stake) would likely be safe, though it might result is larger concentrations of stake having smaller voting rewards, and it would greatly simplify the computation of sortition. + ## 2024-12-23 ### Rust simulation diff --git a/docs/technical-report-1.md b/docs/technical-report-1.md index c326527b..785aa1d6 100644 --- a/docs/technical-report-1.md +++ b/docs/technical-report-1.md @@ -226,6 +226,25 @@ where $f_\text{vote} = f_\text{vote}^\prime \cdot S$ is the mean number of votes ![Probability of number of votes given a stake fraction](../images/prob-leios-vote.svg) +Limiting nodes to a maximum of one vote would likely be safe if the mean number of votes is no larger than the effective decentralizations (i.e., the number of nodes with appreciable stake) would likely be safe, though it might result is larger concentrations of stake having smaller voting rewards, and it would greatly simplify the computation of sortition. + +### Insights regarding sortition + +- All of the sortition is based on Bernoulli trials for each stake of lovelace. +- A probabilistic analysis shows that splitting stake among many nodes only provides a minor benefit in winning the lottery more times. +- The IB lottery is per-slot and limits a node to building a maximum of one IB per slot. + - Careful selection of protocol parameters ensures a high probability of at least one IB in each pipeline. +- The EB lottery is per-pipeline and limits a node to building a maximum of one EB per pipeline. + - The formulation of the EB lottery implies that there will always be a significant chance that a pipeline contains no EB. + - Setting protocol parameters so that there is a high probability of an EB in a pipeline makes the protocol more susceptible to influence by adversaries with significant stake. + - However, the unevenness in stake distribution or splitting of adversarial stake does not exacerbate the situation. +- The pipeline length should be several multiples of the inverse of the active-slot coefficient, in order that there is a high probability for an RB to be available for an EB certificate. +- The vote lottery can award multiple votes to the same node if they have a lot of stake and are luck. + - Nodes nearly saturated with stake have an appreciable chance of receiving several votes. + - The mathematics and code for determining from the VRF the number of votes a node winds is more complex and involves complications to avoid floating-point computations. + - Limiting nodes to a maximum of one vote would likely be safe if the mean number of votes is no larger than the effective decentralizations (i.e., the number of nodes with appreciable stake) would likely be safe, though it might result is larger concentrations of stake having smaller voting rewards, and it would greatly simplify the computation of sortition. + + ## Voting and certificates