Logbook.md
+53-37
@@ -2,13 +2,29 @@
2
2
3
3
## 2024-12-06
4
4
5
+
### ΔQ
6
+
7
+
- created a ΔQ model (`comparison_rs.txt`) of transaction diffusion in the Rust simulation:
8
+
- propagation among the five clusters followed by propagation within the 40 nodes of each cluster
9
+
- only fixed message delays of 12ms, 69ms, 268ms, independent of message size
10
+
- general structure of completion matches the timings, but the completion rate is overall quite different
11
+
- spreading to neighbor clusters (3×68ms) followed by another such hop should hit all clusters, but that doesn’t happen in the simulation either: it waits until 3×268ms before it can break through 74% completion
12
+
- **conclusion:** I don’t really understand what the simulation is doing, even though the Rust code looks obvious enough, and obviously correct on the node level; will dive into the machine room later
13
+
- created a ΔQ model (`comparison_hs.txt`) of Praos block diffusion in the Haskell simulation:
14
+
- hs simulates TCP window collapse, which adds a very latency-dependent additional delay to block transfer times — I wasn’t able to adequately model that, plausible ΔQ expressions lead to too slow completion
15
+
- when TCP window collapse is hacked out (thanks Andrea!) I get close matching of the result with a ΔQ expression, however, that expression does not match the stated simulation behaviour
16
+
- in particular: it matches only when assuming that blocks are _not validated_ during relaying, only afterwards before adoption
17
+
- one suspicious detail: according to my (hopefully not buggy!) measurement, the network topology for the hs simulation has a clustering coefficient of exactly zero — I was unable to find a single triangle
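The zero-clustering observation can be cross-checked independently of the original measurement. Here is a minimal sketch, assuming the topology can be exported as an undirected edge list; the two toy graphs below are made up for illustration and are not the hs simulation's topology. It counts triangles and computes the global clustering coefficient as 3 × triangles / connected triples.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def count_triangles(adj):
    """Each triangle is seen once per corner, hence the division by 3."""
    corners = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                corners += 1
    return corners // 3

def global_clustering(adj):
    """3 * triangles / connected triples; 0.0 when there are no triples."""
    triples = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    return 3 * count_triangles(adj) / triples if triples else 0.0

# Hypothetical toy graphs, not the simulation topology.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 4-cycle: no triangles
square_with_chord = square + [(0, 2)]        # the chord closes two triangles

for name, edges in [("square", square), ("square_with_chord", square_with_chord)]:
    adj = build_adjacency(edges)
    print(name, count_triangles(adj), global_clustering(adj))
```

A coefficient of exactly zero is what any bipartite or tree-like graph produces, so checking the generated topology for such structure may be a quicker diagnosis than hunting for a bug in the measurement.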
18
+
5
19
### Haskell simulation
6
20
7
21
First Leios visualisations implemented (on `andrea/leios-p2p` branch atm):
22
+
8
23
- short-leios-1: 2 nodes, showing every mini-protocol message.
9
24
- short-leios-p2p-1: 100 nodes, showing transfers of RB,IB,EB,Votes and some statistics.
10
25
11
26
Next steps:
27
+
12
28
- Improve readability of short-leios-p2p-1 to differentiate pipelines
13
29
and kinds of blocks.
14
30
- Verify parameters are set to sensible values (in particular wrt
@@ -30,33 +46,33 @@ This all is a work and progress and values may change significantly in the futur
30
46
31
47
- Sortition: 50 ms
32
48
- Votes
33
-
- Number: 500
34
-
- Size: 500 B
35
-
- Construction: 0.65 ms
36
-
- Verification: 0.15 ms
49
+
- Number: 500
50
+
- Size: 500 B
51
+
- Construction: 0.65 ms
52
+
- Verification: 0.15 ms
37
53
- ALBA certificate
38
-
- Size: 75 kB
39
-
- Construction (aggregation plus proof): 200 ms
40
-
- Verification: 0.15 ms
54
+
- Size: 75 kB
55
+
- Construction (aggregation plus proof): 200 ms
56
+
- Verification: 0.15 ms
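As a rough sanity check on the vote and certificate figures, the arithmetic can be spelled out. This is only a sketch: it assumes that all 500 votes are verified one after another on a single core and that 1 kB means 1000 bytes, neither of which is stated in the log.

```python
# Parameters copied from the list above.
VOTE_COUNT = 500
VOTE_SIZE_B = 500
VOTE_VERIFY_MS = 0.15
CERT_SIZE_B = 75_000      # 75 kB, assuming 1 kB = 1000 B
CERT_VERIFY_MS = 0.15

def vote_totals():
    """Aggregate size and sequential verification time of all votes."""
    total_size_b = VOTE_COUNT * VOTE_SIZE_B
    total_verify_ms = VOTE_COUNT * VOTE_VERIFY_MS
    return total_size_b, total_verify_ms

total_size_b, total_verify_ms = vote_totals()
print(f"raw votes: {total_size_b / 1000:.0f} kB, "
      f"{total_verify_ms:.0f} ms to verify sequentially")
print(f"size ratio votes/certificate: {total_size_b / CERT_SIZE_B:.2f}")
print(f"verification ratio votes/certificate: {total_verify_ms / CERT_VERIFY_MS:.0f}")
```

Under these assumptions the certificate is a few times smaller than the raw votes and far cheaper to verify, at the price of the 200 ms construction step.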
41
57
42
58
### Draft of several sections of the first tech report
43
59
44
60
We now have a full draft of several sections of the technical report.
45
61
46
62
- Cost analysis
47
-
- Simulation of transaction volume on Cardano
48
-
- Estimation of costs for a Leios SPO
49
-
- Cost of storage
50
-
- Break-even cost for perpetual storage of blocks
51
-
- Compressed storage of Praos blocks
63
+
- Simulation of transaction volume on Cardano
64
+
- Estimation of costs for a Leios SPO
65
+
- Cost of storage
66
+
- Break-even cost for perpetual storage of blocks
67
+
- Compressed storage of Praos blocks
52
68
- Rewards received
53
-
- Importance of Cardano Reserves
69
+
- Importance of Cardano Reserves
54
70
- Insights for Leios techno-economics
55
71
- Approximate models of Cardano mainnet characteristics
56
-
- Transaction sizes and frequencies
57
-
- Stake distribution
72
+
- Transaction sizes and frequencies
73
+
- Stake distribution
58
74
59
-
Work is in progress on voting and certificates in https://github.com/input-output-hk/ouroboros-leios/pull/94. The following subsections have been fully drafted:
75
+
Work is in progress on voting and certificates in <https://github.com/input-output-hk/ouroboros-leios/pull/94>. The following subsections have been fully drafted:
60
76
61
77
- Voting and certificates
62
78
- Structure of votes
@@ -334,7 +350,7 @@ Findings:
334
350
335
351
### Techno-economic analysis of SPO nodes
336
352
337
-
The *Refined Estimate* tab of the [Leios High-Level Resources Estimates spreadsheet](analysis/Leios%20resource%20estimates%20-%20ROUGH%20ESTIMATE.ods) computes node costs for SPOs under Praos and Leios.
353
+
The _Refined Estimate_ tab of the [Leios High-Level Resources Estimates spreadsheet](analysis/Leios%20resource%20estimates%20-%20ROUGH%20ESTIMATE.ods) computes node costs for SPOs under Praos and Leios.
338
354
339
355
- Each SPO has one block producer and two relays.
340
356
- CPU, IOPS, disk, and network costs are estimated.
@@ -721,7 +737,7 @@ Agenda:
721
737
- IB of shard i should not have tx consuming token of shard j
722
738
- fees of IB i are paid with token shard i
723
739
- ensure IB from different shards will never consume token from other shards
724
-
- *important*: fees are always paid, even if tx is not included in the ledger
740
+
- _important_: fees are always paid, even if tx is not included in the ledger
725
741
- Q: what about multiple tokens per UTxO?
726
742
- grinding with people trying to overload one shard?
727
743
- \# shards w.r.t. IB rate => decrease probability of concurrent IBs for the same shard
@@ -781,11 +797,11 @@ The diagram above illustrates a techno-economic business case for Leios adoption
781
797
782
798
We could consider the following goals for January 2025.
783
799
784
-
- *Technical goal for PI8:* Estimate a reasonably tight upper bound on the cost of operating a Leios node, as a function of transaction throughput, and estimate the maximum practical throughput.
800
+
- _Technical goal for PI8:_ Estimate a reasonably tight upper bound on the cost of operating a Leios node, as a function of transaction throughput, and estimate the maximum practical throughput.
785
801
- Target level: SRL2
786
-
- *Business goal for PI8:* Identify (a) the acceptable limit of transaction cost for Cardano stakeholders, (b) the maximum throughput required by stakeholders, and (c) the throughput-cost relationship for other major blockchains.
802
+
- _Business goal for PI8:_ Identify (a) the acceptable limit of transaction cost for Cardano stakeholders, (b) the maximum throughput required by stakeholders, and (c) the throughput-cost relationship for other major blockchains.
787
803
- Target level: IRL3
788
-
- *Termination criteria for Leios:* Transaction costs are unacceptably high for Leios or the practical maximum throughput fails to meet stakeholder expectations. In this case the Leios protocol may need reconceptualization and redesign, or it may need to be abandoned.
804
+
- _Termination criteria for Leios:_ Transaction costs are unacceptably high for Leios or the practical maximum throughput fails to meet stakeholder expectations. In this case the Leios protocol may need reconceptualization and redesign, or it may need to be abandoned.
789
805
790
806
### Haskell Simulation
791
807
@@ -973,7 +989,7 @@ Main question is what to test (first)? And how to test? Network diffusion seems
973
989
- The node can fetch new headers and blocks
974
990
- The node can diffuse new headers and blocks
975
991
- It must not propagate equivocated blocks more than once
976
-
- But it must propagate them at least once to ensure a *proof-of-equivocation* is available to all honest nodes in the network
992
+
- But it must propagate them at least once to ensure a _proof-of-equivocation_ is available to all honest nodes in the network
977
993
978
994
How does coverage come into play here?
979
995
@@ -1008,7 +1024,7 @@ Discussing some possible short-term objectives:
1008
1024
- start with Adversarial scenarios, answering the question on where to define the behaviour: in the spec or in the tester?
1009
1025
- simulations/prototypes will need to have some ways to interact w/ tester => interfaces can be refined later
@@ -1079,16 +1095,16 @@ We can run the conformance tests in the ledger spec :tada:
1079
1095
#### What approach for Leios?
1080
1096
1081
1097
- We don't have an executable Agda spec for Leios, only a relational one (with holes).
1082
-
- We need to make the spec executable, but we know from experience with Peras that maintaining *both* a relational spec and an executable spec is costly
1098
+
- We need to make the spec executable, but we know from experience with Peras that maintaining _both_ a relational spec and an executable spec is costly
1083
1099
- to guarantee at least soundness we need to prove that the executable spec correctly implements the relational one, which is non-trivial
1084
1100
- Also, a larger question is how do we handle adversarial behaviour in the spec?
1085
1101
- it's expected the specification uses dependent types to express the preconditions for a transition, so that only valid transitions can be expressed at the level of the specification
1086
-
- but we want the *implementation* to also rule out those transitions and therefore we want to explicitly test failed preconditions
1102
+
- but we want the _implementation_ to also rule out those transitions and therefore we want to explicitly test failed preconditions
1087
1103
- then the question is: how does the (executable) specification handle failed preconditions? Does it crash? Can we know in some way that it failed?
1088
1104
- we need to figure out how this is done in the ledger spec
1089
1105
- In the case of Peras, we started out modelling an `Adversary` or dishonest node in the spec but this proved cumbersome and we needed to relax or remove that constraint to make progress
1090
1106
1091
-
- however, it seems we really want the executable spec to be *total* in the sense that any sequence of transitions, valid or invalid, has a definite result
1107
+
- however, it seems we really want the executable spec to be _total_ in the sense that any sequence of transitions, valid or invalid, has a definite result
1092
1108
1093
1109
- we have summarized the short-term plan [here](https://github.com/input-output-hk/ouroboros-leios/issues/42)
1094
1110
- we also need to define a "longer" term plan, eg. 2 months horizon
@@ -1260,7 +1276,7 @@ ND starts raising a few concerns he has about leios that should be answered:
1260
1276
- How does it work at saturation?
1261
1277
1262
1278
A key issue is the potential attack vector that comes from de-duplicating txs: how is it handled by the Leios forwarding infra? In general, how does Leios deal with adversarial behaviour?
1263
-
We acknowledge this needs to be answered, and there's work on mempool management that needs to happen, but that's not the core topic we want to work on *now*
1279
+
We acknowledge this needs to be answered, and there's work on mempool management that needs to happen, but that's not the core topic we want to work on _now_
1264
1280
1265
1281
Another important question to answer is "What resources are needed?" as this has a deep impact on centralisation:
1266
1282
@@ -1633,7 +1649,7 @@ Here are a few comments by @bwbush about the `leios-sim` package:
1633
1649
1634
1650
Added some documentation to the Leios simulator:
1635
1651
1636
-
- Added *tooltips* to document the various parameters available
1652
+
- Added _tooltips_ to document the various parameters available
1637
1653
- Added readonly fields computing various aggregates from the simulation's data: Throughput, latency to inclusion in EB, dropped IB rate
1638
1654
- Added a [comment](https://github.com/input-output-hk/ouroboros-leios/issues/7#issuecomment-2236521300) on the simulator issue as I got perplexed by the result of the throughput computation: I might be doing something wrong and not computing what I think I am computing, as the results are inconsistent. I think this comes from the fact that we are simulating 2 nodes, so the throughput aggregates both nodes' values when it should be assigned to each node individually, perhaps as a distribution?
1639
1655
@@ -1656,12 +1672,12 @@ Managed to configure the ECS cluster, service, and task to run the image, but it
1656
1672
1657
1673
need to configure a secret containing a PAT for pulling the manifest: <https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definition_repositoryCredentials>
1658
1674
1659
-
I gave up trying to run on AWS: every solution I found is an insanely intricate maze of stupidly complicated steps which I don't care about, as I only need to deploy a *single* image without any data dependency attached.
1675
+
I gave up trying to run on AWS: every solution I found is an insanely intricate maze of stupidly complicated steps which I don't care about, as I only need to deploy a _single_ image without any data dependency attached.
1660
1676
1661
1677
I managed to get the Gcloud Run deployment working, mostly by copy-pasting what I did for Peras and fiddling with it.
1662
1678
1663
1679
- I reused the same service account as Peras, which is a mistake -> should create a new service account with limited rights
1664
-
- Needed to add the service account as an *owner* of the domain in the google console (a manual task) in order to allow subdomain mapping
1680
+
- Needed to add the service account as an _owner_ of the domain in the google console (a manual task) in order to allow subdomain mapping
1665
1681
- Changed the server code to support defining its port from `PORT` environment variable which is provided by the deployment configuration
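Reading the listening port from `PORT` is a common pattern on container platforms. A minimal sketch of the idea follows, written in Python purely for illustration: the actual server code is not shown in this log, and the `resolve_port` name and the 8080 fallback are assumptions.

```python
import os

DEFAULT_PORT = 8080  # hypothetical fallback, not taken from the log

def resolve_port(env=None):
    """Return the port from the PORT variable, or the default if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "")
    try:
        port = int(raw)
    except ValueError:
        return DEFAULT_PORT
    # Reject values outside the valid TCP port range.
    return port if 0 < port < 65536 else DEFAULT_PORT

print(resolve_port({"PORT": "9000"}))
print(resolve_port({}))
```

Passing the environment as a parameter keeps the lookup testable without touching the real process environment.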
1666
1682
1667
1683
Allowing anyone to access the server proved annoying too: The following configuration works
@@ -1790,7 +1806,7 @@ The recording is available on GDrive: <https://drive.google.com/file/d/1r04nrjMt
1790
1806
1791
1807
Discussing with researchers on some early simulations that are being worked on for Leios.
1792
1808
1793
-
- Constraint: Set up a threshold on *egress* bandwidth, then simulate diffusion of a block to downstream peers
1809
+
- Constraint: Set up a threshold on _egress_ bandwidth, then simulate diffusion of a block to downstream peers
1794
1810
- upstream sends notification (e.g. header)
1795
1811
- downstream asks for block body if it does not have it
1796
1812
- then it "validates" (simulated time) and advertises to neighbours
@@ -1804,7 +1820,7 @@ Discussing with researchers on some early simulations that are being worked on f
1804
1820
- δ = 8 (4 inbound, 4 outbound)
1805
1821
- b/w limit = 1Mb/s
1806
1822
- block size ~ 1kB
1807
-
- when sending 10 blocks/s we observe more variation, a bit more contention as the *freshest first* policy starts to kick in
1823
+
- when sending 10 blocks/s we observe more variation, a bit more contention as the _freshest first_ policy starts to kick in
1808
1824
- at 1 block/ms there's a much wider variation in the time it takes to reach nodes
1809
1825
- the first blocks take the longest as the queues are filling up with fresher blocks
1810
1826
- latest blocks go faster, almost as fast as when rate is much slower, but this is also an artifact of the simulation (eg. time horizon means there's no block coming after which decreases contention)
@@ -1854,7 +1870,7 @@ Spyros will work this week on network simulation for Leios
1854
1870
- need to queue local actions according to bandwidth availability
1855
1871
- main input parameter is IB generation rate
1856
1872
- output = delivery ratio of IBs
1857
-
- if IB rate > threshold -> most blocks won't make it because of *freshest first* policy
1873
+
- if IB rate > threshold -> most blocks won't make it because of _freshest first_ policy
1858
1874
1859
1875
Next steps:
1860
1876
@@ -1961,10 +1977,10 @@ Here is some draft we drew:
1961
1977
1962
1978
Couple explanations:
1963
1979
1964
-
- Upper part is about *equivocation*, eg. an adversary producing different IBs at the same slot.
1965
-
- a node will observe the equivocation (on the far right) by being offered 2 *equivocated* headers from different peers
1966
-
- This node will be able to produce a *proof of equivocation* that's useful when voting for IBs (and EBs?)
1967
-
- Lower part is about *freshest first* download policy: Two nodes producing valid IBs at different slots.
1980
+
- Upper part is about _equivocation_, eg. an adversary producing different IBs at the same slot.
1981
+
- a node will observe the equivocation (on the far right) by being offered 2 _equivocated_ headers from different peers
1982
+
- This node will be able to produce a _proof of equivocation_ that's useful when voting for IBs (and EBs?)
1983
+
- Lower part is about _freshest first_ download policy: Two nodes producing valid IBs at different slots.
1968
1984
- given the choice of headers (and bodies) consumer node will choose to download the freshest body first, eg. B in this case
1969
1985
- headers are downloaded in any order as we can't know whether or not they are "freshest" before reading them
1970
1986
- It seems that's only relevant if there are more blocks offered than available bandwidth :thinking:
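The freshest-first choice can be sketched as a priority queue keyed on slot number. This is an illustrative toy, not the Leios implementation: the `Header` type, its fields, the slot values, and the `download_order` helper are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    block_id: str
    slot: int  # higher slot = fresher block

def download_order(headers):
    """Return block ids in the order their bodies would be fetched (freshest first)."""
    # heapq is a min-heap, so the slot is negated to pop the highest slot first;
    # block_id breaks ties deterministically.
    heap = [(-h.slot, h.block_id) for h in headers]
    heapq.heapify(heap)
    order = []
    while heap:
        _, block_id = heapq.heappop(heap)
        order.append(block_id)
    return order

# Headers may arrive in any order; bodies are still requested freshest first.
offered = [Header("A", slot=10), Header("B", slot=12)]
print(download_order(offered))
```

With enough bandwidth every body is eventually fetched and the ordering only affects latency, which is consistent with the remark that the policy matters mainly when more blocks are offered than the link can carry.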