Safely generating L2 blocks based purely on L1 blockdata #226
Replies: 21 comments
-
@ben-chain dope writeup! Is there a really simple way to explain why the keccak chunking fails? It is because of what @protolambda calls "state poisoning" - this happens when state is loaded into the machine without the ability to disprove its validity. The reason why you cannot disprove the first chunk's state access is that the validity of that state access actually depends on the final hash that is computed; only then can you retroactively say "that was an invalid chunk". The part of this that I feel hazy about is whether it is helpful to determine where you disagree in the hash chunking process, and then afterwards post all the data. Random ideas
-
@ben-chain can we write this in a way such that it is non-interactive? IMO that would simplify it conceptually a lot. Non-interactive but expensive. Even though it sounds somewhat nightmarish to do, it would be super isolated & seems relatively easy to test.
-
This is the right intuition, yes. The problem with "trying to find out where you disagree" is that an attacker will always try to disagree at the first chunk, devolving the game into rolling up the entire thing.
-
@karlfloersch this statement isn't obviously true to me, but perhaps I'm missing something? I can't think of many on-chain applications that in practice would depend on being able to receive 30MM gas worth of calldata. Until recently they were all making do with more like 13MM on L1 anyway. IMO, implementing an on-chain sponge function (as fun as it sounds!) is a major endeavour in itself. If it is essential to the mission of creating an EVM-equivalent ORU, then we must do it. But if it results in something like a 20% to 30% increase in maximum TxData size, I'm less certain of its value.
-
It's less about the feature and more about protocol complexity. We have the choice of highly encapsulated complexity (implement a pre-image verifier for any size preimage) vs leaky complexity across multiple parts of our protocol. We need to add logic in Geth to reject txs over a specific size, and make the preimage oracle return null for oversized preimages. Some other places where I'm afraid of complexity popping up:
TLDR: I agree that supporting large txs is not a feature that users are clamoring for; however, I think using this seemingly simple timeout approach will actually introduce gnarly complexity. + ofc EVM equivalence.
-
Okay, thanks for humoring me again on the call. I think I understand the interactive game over the commitment chain now. So if my understanding is correct, I can cause an otherwise valid chain to roll back at any point during the fraud proof period: suppose I submit a fraudulent fraud proof against a valid rollup transition. It seems likely that this could be used to double spend, since most people won't wait for rollup finality, only L1 finality.
-
So there are two things I would say here:
-
Ah, I forgot that each interaction needs to be disputable. I think in that case my example doesn't work, because one honest party could dispute my fraudulent fraud proof.
-
Yep, IMO this is the worst part of the complexity -- I think fundamentally this forces us to choose between:
With unlimited preimage sizes, you might also imagine a world in which L2 contracts can pay to "subscribe" to all L1 events of a given address/topic. I think this is fundamentally impossible without that.
-
Update on the above: @karlfloersch pointed out that we can do
-
This decision changes the implementation of the deposit feed. If we can load large preimages, then the deposit feed is a contract which just emits events. If we cannot load large preimages, then the deposit feed must enforce that the deposit data is small and store the hash of the deposit.
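To make the two designs concrete, here is a toy Python sketch of both deposit feeds (the class names and the size bound are hypothetical stand-ins, not actual contracts; `sha3_256` stands in for Keccak-256):

```python
import hashlib

MAX_DEPOSIT_SIZE = 256  # hypothetical bound on deposit data

class DepositFeedWithLargePreimages:
    """If large preimages can be loaded in the fraud proof: just emit events."""
    def __init__(self):
        self.events = []
    def deposit(self, data: bytes) -> None:
        self.events.append(data)  # models emitting an L1 event, any size

class DepositFeedWithoutLargePreimages:
    """Otherwise: enforce a size bound and store only the deposit's hash."""
    def __init__(self):
        self.deposit_hashes = []
    def deposit(self, data: bytes) -> None:
        if len(data) > MAX_DEPOSIT_SIZE:
            raise ValueError("deposit data too large")
        self.deposit_hashes.append(hashlib.sha3_256(data).digest())

feed_a = DepositFeedWithLargePreimages()
feed_a.deposit(b"\x00" * 10_000)    # fine: the event can be arbitrarily big
feed_b = DepositFeedWithoutLargePreimages()
feed_b.deposit(b"small deposit")    # ok: under the bound, only hash is kept
```

The second design trades generality (bounded deposit size) for never needing a large preimage in the fraud proof.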
-
Trying to reformulate the whole thing based on semi-recent team conversations, and adding some new insights at the end, including a new way to operationalize it. In the new design, we run the fraud proof on a run of Geth that validates an epoch's worth of blocks. The fraud proof consists of a dispute game where both parties converge on a single step whose execution they disagree on. At this stage, it is necessary to seed the preimage oracle with the preimages of any data accessed by the contested instruction. These data could be:
We would compile Geth to MIPS (or some similar minimal instruction set). Above, (1) is a MIPS instruction, while (2), (3) and (4) map to "special" instructions (the "preimage oracle") used to access the fraud proof program's input (data in some L1 block). The trouble is case (4), because an entry in the receipt trie comprises all log entries for the transaction. A contract that submits a deposit could first emit a huge log filled with zeroes, which would exceed what can be held in a block's worth of calldata. (Side note: we can't store deposits in calldata, because we want to let contracts submit deposits, and only EOAs can produce "top-level" transactions.) As Karl says just above, one possibility is simply to check the size of deposits and store their hash. The other option is to somehow enable submitting the proven receipt trie entry in multiple stages. This is what all the talk about decomposing the hash function is about. EDIT: The below does not work; see @protolambda's comment and my follow-up comment with a method that works.
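As an illustration of the dispute endgame described above, here is a toy sketch, assuming a hypothetical two-instruction machine and a pre-seeded oracle (nothing here reflects the actual Geth-to-MIPS implementation):

```python
import hashlib

def execute_step(pre_state: dict, oracle: dict) -> dict:
    """Execute exactly one instruction of a toy machine (hypothetical ISA).

    ("add", a, b) adds register b into register a; ("load_preimage", h)
    copies the preimage of hash h from the seeded oracle into memory,
    modeling the "special" preimage-oracle instructions.
    """
    state = dict(pre_state)
    op, *args = state["code"][state["pc"]]
    if op == "add":
        a, b = args
        state["regs"] = dict(state["regs"], **{a: state["regs"][a] + state["regs"][b]})
    elif op == "load_preimage":
        (h,) = args
        if h not in oracle:
            raise KeyError("preimage was not seeded before the contested step")
        state["mem"] = oracle[h]
    state["pc"] += 1
    return state

data = b"receipt entry bytes"
h = hashlib.sha3_256(data).hexdigest()
pre = {"pc": 0, "regs": {"r1": 2, "r2": 3},
       "code": [("load_preimage", h), ("add", "r1", "r2")]}
oracle = {h: data}               # seeded upstream of the one-step check
mid = execute_step(pre, oracle)  # the contested instruction runs exactly once
post = execute_step(mid, oracle)
```

The point of the sketch: the one-step check is cheap only because every preimage the step touches was already seeded; case (4) is exactly where that seeding can become too large.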
-
The problem is not the computation, but the authentication, of the input data. There is no way to prove the input to the hash function is correct unless you can verify it against the output in the same step of the dispute. Rollups work because the inputs are all already authenticated: the hash function incrementally computes the output over trusted memory. But when loading new pre-images, the input is not authenticated, and has to be consumed in one go to establish that it's the correct input to use. See "state poisoning" for earlier thoughts on this. To solve the computation you can indeed do a rollup-like multi-step computation of the keccak hash, and dispute steps in between, but the problem is that the input data itself is never authenticated. Determining the contents of the very first step, when you only have the final output hash to trust the input, is not possible. You could try to reverse the execution of the steps, by presenting the last chunk of data and checking it matches the output when combined with the internal hashing state, and then working your way back through all the chunks. But that's not secure in this case, since keccak does not follow a Merkle–Damgård construction like sha256.
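A toy sponge makes the asymmetry concrete: the digest reveals only the "rate" portion of the final internal state, so a claimed internal state can never be authenticated from the output alone (toy construction, NOT secure, with made-up parameters):

```python
import hashlib

# Toy sponge for illustration only: 8-byte rate, 8-byte capacity, with the
# permutation modeled by sha256 truncated to the 16-byte state width.
RATE, CAP = 8, 8

def permute(state: bytes) -> bytes:
    return hashlib.sha256(state).digest()[:RATE + CAP]

def sponge_hash(msg: bytes) -> bytes:
    assert len(msg) % RATE == 0          # padding omitted in this sketch
    state = bytes(RATE + CAP)
    for i in range(0, len(msg), RATE):
        # absorb: xor the next block into the rate portion, then permute
        mixed = bytes(a ^ b for a, b in zip(state[:RATE], msg[i:i + RATE]))
        state = permute(mixed + state[RATE:])
    return state[:RATE]                  # squeeze: only the rate is revealed

digest = sponge_hash(b"16 byte message.")
# Two different claimed final *internal* states that both match the digest:
# the capacity bytes never appear in the output, so a verifier holding only
# the digest cannot tell them apart. With sha256 (Merkle-Damgard), the
# digest IS the full final chaining state, so no such ambiguity exists.
final_a = digest + bytes(CAP)
final_b = digest + b"\xff" * CAP
```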
-
Figured out the same over the weekend. Essentially the issue is that the chunks need to be proved upstream of the fraud proof (while seeding the preimage oracle) in case the parties disagree on a chunk. (Note that the term "preimage oracle" is really awkward here.) Given this, the protocol needs to prove the whole receipt entry. Here is a protocol for how this could be done while seeding the preimage oracle:
This logic only proves a receipt entry; additional logic is required to parse the entry and extract logs matching deposits. This can be done in Solidity while seeding, or in Geth directly (the difference would be whether the "special instructions" fetch a whole receipt entry, or just a deposit).
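A rough sketch of staged submission, under the assumption that only the constant-size hashing state is carried between stages (the `PreimageSubmission` API is hypothetical, and Python's `sha3_256` stands in for Ethereum's Keccak-256, which uses different padding):

```python
import hashlib

class PreimageSubmission:
    """Hypothetical sketch of submitting a large preimage in stages.

    Each feed() call models one transaction posting a chunk of the receipt
    entry; only the constant-size hashing state is carried between calls,
    never the full preimage.
    """
    def __init__(self, expected_hash: bytes):
        self.expected_hash = expected_hash
        self.hasher = hashlib.sha3_256()
    def feed(self, chunk: bytes) -> None:
        self.hasher.update(chunk)     # absorb; internal state stays small
    def finalize(self) -> bool:
        # only once everything is absorbed can the input be authenticated
        return self.hasher.digest() == self.expected_hash

entry = b"\x00" * 10_000              # a large receipt entry
expected = hashlib.sha3_256(entry).digest()
sub = PreimageSubmission(expected)
for i in range(0, len(entry), 1024):  # 1024-byte chunks, arbitrary size
    sub.feed(entry[i:i + 1024])
ok = sub.finalize()
```

Note that this only authenticates the input at `finalize()`, which is exactly the limitation discussed in the surrounding comments: no individual chunk is proven correct on its own.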
-
Running all the computation on-chain works, but it's very costly (a full L1 block or more!), so let's consider the alternatives as well. One other approach is to do a fraud-proof over the thing that generated the large pre-image (L1 execution) at the same time (execute the L1 source block before executing the derived L2 block(s)). Then keep the receipt (or whatever large input) in the authenticated memory, to avoid using the pre-image oracle for it later. The pros:
The cons:
Or another alternative is to merkleize receipts in some better way on L1, and see if it can be adopted during the Merge (transactions are already going to be merkleized nicely, and will be accessible through a binary Merkle proof via the beacon-block hash-tree-root).
-
Been thinking about this over the weekend, finally took the time to write it out. I think this works, but I'd appreciate a review in case I missed something.

The Issue

When executing the single execution step during a fault proof, it is necessary to prove the input to the execution step. This could include:

The problem occurs in (2): all log entries generated by a transaction are concatenated before being hashed for inclusion in the receipts Merkle tree. A malicious actor could emit huge log entries before making a deposit, resulting in a hash pre-image that is too large to post on-chain as calldata. See here for some numbers: a byte of log data costs 8 gas, meaning it could result in ~3.75MB of data. Assuming none of these bytes are 0, this would take two full blocks' worth of calldata to post on-chain. This is much too large/expensive. Note that this problem does not occur in (1), since we can limit the batch size at the protocol level (define larger batches as invalid), nor in (3), as slots are only 256 bits, nor in (4), as we are free to define how we Merkleize program state.

The Nice Solution

First, we can prove the hash of the receipt Merkle tree entry using the usual Merkle proof mechanism, but cut short so as not to prove the data, only its hash. The problem is then: how to prove a chunk of the hashed input without posting the whole input on-chain. The Keccak (SHA-3) hashing algorithm can be decomposed as the repeated application of a function (let's call it `f`). Let's call the input (the receipt entry) `x`, split into chunks `c_1, ..., c_n`. Then the hash can be represented by:
hash(x) = s_n, where s_0 is a fixed initial state and s_i = f(s_{i-1}, c_i)
(i.e. the repeated application of `f` to the successive chunks of the input). The two participants can post all the intermediate states they think lead to the hash. If they agree on the intermediate states, any single chunk of input can be checked independently against the two states adjacent to it. "All internal states" might be too big in practice, but they can post a commitment to them instead. Then, they can play a bisection game on this commitment to find the last intermediate state they agree on. If the participants disagree on some intermediate state, we take the last state they agree on and the first they disagree on, and verify the single application of `f` between them on-chain.
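The decomposition and the bisection game can be sketched in Python as follows. One hedge up front: modeling `f` with a full hash (sha256 here) silently gives every step collision resistance, which, as the follow-up comments point out, the real Keccak permutation step does not provide, so this toy overstates the scheme's security:

```python
import hashlib

# Toy model (hypothetical names): the hash of x is the repeated application
# of a step function f to fixed-size chunks c_1..c_n.
CHUNK = 32

def f(state: bytes, chunk: bytes) -> bytes:
    return hashlib.sha256(state + chunk).digest()

def intermediate_states(x: bytes) -> list:
    states = [bytes(32)]  # s_0: fixed initial state
    for i in range(0, len(x), CHUNK):
        states.append(f(states[-1], x[i:i + CHUNK]))
    return states         # states[-1] plays the role of the final hash

def first_disagreement(honest: list, claimed: list) -> int:
    # Bisection: find the first index where the two state lists differ,
    # assuming they agree at index 0 and disagree at the last index.
    lo, hi = 0, len(honest) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if honest[mid] == claimed[mid]:
            lo = mid
        else:
            hi = mid
    return hi

x = bytes(range(256)) * 4                 # 1024-byte input, 32 chunks
honest = intermediate_states(x)
forged = x[:500] + b"\xff" + x[501:]      # flip one byte in chunk 15
claimed = intermediate_states(forged)
i = first_disagreement(honest, claimed)
# The disputed step is s_i = f(s_{i-1}, c_i): posting chunk c_i and the two
# adjacent states suffices to resolve it, without posting all of x.
```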
-
@norswap Regarding the nice solution, splitting up the hash function: some hash functions have this feature natively, via the Merkle–Damgård construction. Unfortunately, I don't believe SHA-3 is one of those (iirc @ben-chain asked a friend to confirm when we discussed this option in November).
-
Yes, it uses the sponge construction. Reading about it right now, it seems clear it's not a simple repeated application of a function, but it almost is (the key difference is that there are two functions, one for the "absorption phase" and one for the "squeeze-out phase", where the squeeze function does not take further input and does not xor). It's not quite clear to me that the proposed scheme can't work. In fact, I'd say it has to work, but the concern is the degree of attack resistance: i.e. if you can reverse/find collisions for every step (function application), then you can find a collision for the whole hash, which is obviously not possible at present. But maybe it is too expensive to find a full hash collision, yet cheap enough to find a collision for a single step.
-
Unlike other rollup execution (repeated function application), this is not on top of prior authenticated data. We only have the output, and repeat the steps by showing the supposed pre-state and applying the function to see if it matches the output we already authenticated. Finding collisions on an intermediate input can be a lot easier, depending on the hash function design, since part of the security of the full hashing may derive from the output being unique to a certain starting state of the hash function, which we don't have to show if we only present a collision for an intermediate step that connects with the expected output.
-
Yup, that's exactly what I meant! It remains to figure out what the actual security here is.
-
Okay, so I spent some time thinking about this, and I've convinced myself it isn't possible. The reason is actually pretty interesting. Turns out what I called
-
*(see also discussion here)*
Current State of L2s
In general, the nature of a rollup is such that the L2 state can be entirely defined by the L1 blocks. That is, all possible L2s can be expressed by state transitions taking the form
f(old_l2_state, new_l1_block) -> new_l2_state
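In other words, the entire L2 state is a pure fold of the transition function over the L1 blocks. A toy sketch (the types and the "batch" field are made up for illustration):

```python
from functools import reduce

def f(old_l2_state: tuple, new_l1_block: dict) -> tuple:
    # Toy transition function: append the block's batch data (hypothetical
    # "batch" field) to the L2 state; a real f would execute transactions.
    return old_l2_state + tuple(new_l1_block.get("batch", ()))

genesis = ()
l1_blocks = [{"batch": ("tx1", "tx2")}, {"batch": ()}, {"batch": ("tx3",)}]
l2_state = reduce(f, l1_blocks, genesis)  # the L2 state is a fold over L1
```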
In reality, most (all?) rollups today do not act on raw block data, but instead use an L1 smart contract to store a hash of the calldata sent to a system contract during L1 block execution. This function lived in `OVM_CanonicalTransactionChain` for our v1, for example. Here's where Arbitrum does it as well.

The Ideal
Our goal of EVM equivalence should make the idea of acting on raw L1 block data more feasible. In reality, a commitment to input calldata is already stored within the L1 blockhash itself, and we could use this as the verification source during the fraud proof. This has been proposed by @lightclient and others. It would give us optimal gas costs, and also set us up for ETH2p1, in which the place that calldata is submitted to will not have any smart contract capabilities.
Problem Statement
The above post proves that it's feasible to give the L2 virtual machine access to some `get_l1_blockhash(number)` function. However, it does not tell us how we can safely use it. Geohot has proposed one way to use this blockhash in a very generalized form: give the L2 VM a function, effectively `get_known_preimage(bytes32 hash)`, which allows it to recover the preimage of a hash into memory. The L2 VM could use this to unpack the block body from the block hash, then unpack the transaction root's children from that, and so on, until it unpacks the relevant transaction data itself.

However, there is a problem: the preimages, even if known off-chain, may be too big to do any verification on-chain. For example, imagine that a block's gas limit was entirely filled with a single massive transaction (identified by `txhash`) to a dummy account. It should be clear that there is no way to verify the `get_known_preimage(txhash)` step on-chain, because there's not enough gas. How should we deal with this?

Solution 1: Upper bound on preimage sizes
The most obvious solution to this is just to enforce a maximum preimage size that can be successfully loaded into memory by the L2 VM. This would mean that `get_known_preimage` returns `null_bytes32` if the real `preimage.length > UPPER_BOUND`. To detect this on-chain, an honest party would just allow their challenger to run out of time on the dispute game, because it is impossible to prove anything else. This is the easiest to implement, but it has the downside that the fraud proofs can no longer validate arbitrary L1 blocks -- only those below some size constraint.
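A sketch of this bounded oracle, with the behavior assumed from the text (`UPPER_BOUND`'s value is arbitrary, and `sha3_256` stands in for Keccak-256):

```python
import hashlib

UPPER_BOUND = 1024          # hypothetical maximum loadable preimage size
NULL_BYTES32 = bytes(32)

_known = {}                 # off-chain store of learned preimages

def learn(preimage: bytes) -> bytes:
    # sha3_256 stands in for Ethereum's Keccak-256 in this sketch
    h = hashlib.sha3_256(preimage).digest()
    _known[h] = preimage
    return h

def get_known_preimage(h: bytes) -> bytes:
    preimage = _known[h]
    if len(preimage) > UPPER_BOUND:
        return NULL_BYTES32  # oversized preimages are treated as unprovable
    return preimage

small = learn(b"small deposit data")
big = learn(b"\x00" * (UPPER_BOUND + 1))
```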
Solution 2: Secondary game on preimage chunks
The keccak algorithm itself can be broken down into smaller individual steps which each ingest a word of the input into a constant-sized state. So another solution is to have the two parties submit the preimage they claim is correct in smaller chunks as calldata. Once all are submitted, then they can play a bisection game on the hash substeps to prove that their preimage will result in the requested hash. This adds complexity, but does allow us to validate arbitrary L1 blocks in their entirety.