Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: verkle integration with disk db #10958

Open
wants to merge 14 commits into
base: verkle
Choose a base branch
from

Conversation

1010adigupta
Copy link
Contributor

@1010adigupta 1010adigupta commented Sep 17, 2024

This is the first pr for the integration of verkle tries in reth, and finally, code modifications for stateless-reth using verkle, I am doing this project as part of my Ethereum protocol fellowship with EF-stateless team, the project will first aim to integrate verkle in reth, then it will aim to pass all the execution-spec-tests designed for Kaustinen devnet-7(verkle's devnet), latest test fixtures for this devnet were released earlier and are available here: verkle-devnet-7-test-fixtures, I have started implementing these tests, beginning with EIP-4762, EIP-6800, remaining EIPs like EIP-7709 will be implemented subsequently, tough these tests can't be added directly to reth since this is blocked by #10595 , since reth does not have yet tool like geth's evm-t8n for running execution-spec-tests (I may be missing something here), hence I am using currently hive for these tests
aim of this pr is to integrate rust-verkle into reth's storage, tough future iterations will have a native verkle-trie, using Alloy-types inherently
would like reviews, and further discussions on it @rkrasiuk @onbjerg @mattsse @gakonst
@mattsse ig a separate branch for verkle might be needed, will push my changes there, also EF-testing team requires a separate branch for final devnet tests?

Comment on lines +19 to +29
pub struct VerkleDb<'a, TX> {
/// The underlying key value database
// We try to avoid fetching from this, and we only store at the end of a batch insert
pub tx: &'a TX,
/// This stores the key-value pairs that we need to insert into the storage
// This is flushed after every batch insert
pub batch: MemoryDb,
/// This stores the top 3 layers of the trie, since these are the most accessed
// in the trie on average
pub cache: MemoryDb,
}
Copy link
Contributor Author

@1010adigupta 1010adigupta Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerkleDb is the main db struct responsible for managing trie-operations, and finally flushing them to disk once cache-depth exceedes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand a bit on the intended usage here? I'm asking because generally we avoid having long-lived read transactions open as they lead to bloat because of how MDBX works

Ideally we'd have a way to feed a set of state transitions, a temporary handle to the database and then get the changes we need to apply to the database from this. This is how we do it with the current state commitment impl and it was specifically designed to avoid MDBX pitfalls

Alternatively, instead of holding a tx, you can hold a DatabaseProvider that can give you a RO tx when needed

Copy link
Contributor Author

@1010adigupta 1010adigupta Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, this approach will delay the garbage collection of MDBX, the intended use case was this:
To use rust-verkle's trie, the DB-interface that will be provided to VerkleConfig to initiate the trie which in turn will be used for all trie-operation requires to implement Flush, ReadOnlyHigherDb and WriteOnlyHigherDb traits, the functionalities of these traits will then be used internally for trie actions, see test_trie_root_commitment here, of here Flush trait is used to Flush all the changes that are currently stored in cache to permanent storage, and functions of ReadOnlyHigherDb is used for trie operations for reading data, and WriteOnlyHigherDb writes data to the cache, so basically VerkleDb works like a in-memory-db, and writes final data to disk only when flush is called and reads data from disk only when it's not found in cache

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two approaches suggested in rust-verkle:

  1. see rocksDB impl either we implement similar functionality for our MDBX or
  2. what I have done till now, create a DB struct and implement required traits for it
    see here for more reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do this:

  • for Flush will follow the existing method for state-commitments
  • for other read methods, will supply DatabaseProvider , instead of keeping a single transaction open
    redesigning the method, adding the fix

Comment on lines 103 to 110
let mut trie_cursor = self.tx.cursor_read::<tables::VerkleTrie>().ok()?;
let mut labelled_key = Vec::with_capacity(key.len() + 1);
labelled_key.push(LEAF_TABLE_MARKER);
labelled_key.extend_from_slice(&key);
trie_cursor.seek_exact(labelled_key)
.ok()
.and_then(|result| result)
.and_then(|(_, value)| value.try_into().ok())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is redundant in all impls of verkleDb, ig need to add functionality such that, to avoid multiple cursor creation

use reth_db_api::{cursor::DbCursorRW, transaction::DbTxMut};
use reth_provider::test_utils::create_test_provider_factory;
#[test]
fn test_verkle_storage_cursor_abstraction() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for basic cursor abstraction



#[test]
fn test_trie_root_commitment() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for trie operations over VerkleDb

@@ -387,6 +387,8 @@ tables! {
/// From HashedAddress => NibblesSubKey => Intermediate value
table StoragesTrie<Key = B256, Value = StorageTrieEntry, SubKey = StoredNibblesSubKey>;

/// Stores current state of verkle tree.
table VerkleTrie<Key = Vec<u8>, Value = Vec<u8>>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using Vec<u8> as the specified type for key-value, since this is what was suggested in rust-verkle's trait impls, but need some advice here, ig will have to convert it to alloy or our own custom type

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where specifically? We'd prefer to use Bytes if possible, I glaced at rust-verkle and it seems they have some &[u8] keys in the repo, we can convert these to Bytes. But not necessarily a blocker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes all keys and values are ultimately converted to &[u8] no matter what, will use Bytes for them, adding the fix

@1010adigupta 1010adigupta changed the title verkle integration with disk db feat: verkle integration with disk db Sep 17, 2024
@rakita
Copy link
Collaborator

rakita commented Sep 18, 2024

@1010adigupta we talked about expectations and verkle in DMs. Due to its complexity, this is not something that we have resources to review rn.

@rakita rakita closed this Sep 18, 2024
@onbjerg
Copy link
Member

onbjerg commented Sep 18, 2024

Relaying here -

I'm reopening this because I think this is valuable. However please note verkle is low prio for us right now, so reviews might not be super thorough. I also don't think we want to merge this into main, at least for now, but I'll set up a verkle branch we can merge stuff into - it's ok if this diverges from main, so don't bother rebasing it that much. This will be helpful to see what parts of the node are affected by verkle and can perhaps inform us on future direction no matter if Ethereum ends up doing verkle or some other form of state commitment.

@onbjerg onbjerg reopened this Sep 18, 2024
@1010adigupta 1010adigupta changed the base branch from main to verkle September 18, 2024 17:21
@1010adigupta
Copy link
Contributor Author

1010adigupta commented Sep 18, 2024

Relaying here -

I'm reopening this because I think this is valuable. However please note verkle is low prio for us right now, so reviews might not be super thorough. I also don't think we want to merge this into main, at least for now, but I'll set up a verkle branch we can merge stuff into - it's ok if this diverges from main, so don't bother rebasing it that much. This will be helpful to see what parts of the node are affected by verkle and can perhaps inform us on future direction no matter if Ethereum ends up doing verkle or some other form of state commitment.

I have changed the base to verkle, please review and merge, I understand the reviews may not be that thorough, but I will try to provide as much info as possible in my prs, for easy reference, each pr will contain tests for all functionality @onbjerg @rkrasiuk

Copy link
Member

@onbjerg onbjerg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and a nit:)

@@ -387,6 +387,8 @@ tables! {
/// From HashedAddress => NibblesSubKey => Intermediate value
table StoragesTrie<Key = B256, Value = StorageTrieEntry, SubKey = StoredNibblesSubKey>;

/// Stores current state of verkle tree.
table VerkleTrie<Key = Vec<u8>, Value = Vec<u8>>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where specifically? We'd prefer to use Bytes if possible, I glaced at rust-verkle and it seems they have some &[u8] keys in the repo, we can convert these to Bytes. But not necessarily a blocker

Comment on lines +19 to +29
pub struct VerkleDb<'a, TX> {
/// The underlying key value database
// We try to avoid fetching from this, and we only store at the end of a batch insert
pub tx: &'a TX,
/// This stores the key-value pairs that we need to insert into the storage
// This is flushed after every batch insert
pub batch: MemoryDb,
/// This stores the top 3 layers of the trie, since these are the most accessed
// in the trie on average
pub cache: MemoryDb,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand a bit on the intended usage here? I'm asking because generally we avoid having long-lived read transactions open as they lead to bloat because of how MDBX works

Ideally we'd have a way to feed a set of state transitions, a temporary handle to the database and then get the changes we need to apply to the database from this. This is how we do it with the current state commitment impl and it was specifically designed to avoid MDBX pitfalls

Alternatively, instead of holding a tx, you can hold a DatabaseProvider that can give you a RO tx when needed

self.tx.commit().unwrap();

let num_items = self.batch.num_items();
println!("write to batch time: {}, item count : {}", now.elapsed().as_millis(), num_items);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid using println! and use reth-tracing instead like the rest of the codebase, it allows us to filter the logs:) This should e.g. be a debug level trace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, will keep this in mind, adding fix

@1010adigupta
Copy link
Contributor Author

Had to follow this method to get rust-verkle, integrated in reth's disk, so that I could start with execution-spec tests for transition, gas-costs etc(eips basically)
I am parallelly designing the trie-from scratch so that we use reth's existing crates, types and procedure as much as possible

@gakonst
Copy link
Member

gakonst commented Sep 20, 2024

Just echoing Oliver's comments to ensure expectations are set!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants