Validate Database Snapshot
When creating a new node, it is recommended to start with an existing snapshot of the database.
There are two types of databases:
- The chain database (stored in $DBPATH/0/rocksDb)
- The Pact databases (one for each chain, stored in $DBPATH/0/sqlite)
If the origin of the snapshot is not fully trusted, the snapshot should be validated. Validation consists of two parts:
- Validate the Merkle tree of the chain database,
- Rebuild the pact databases from scratch.
This can be done with the following steps:
- delete all sqlite db files from the database directory before starting the node,
- run the node with --prune-chain-database=headers-checked --only-sync-pact --validateHashesOnReplay.
On startup the node first validates all Merkle hashes (this takes between 1h and 3h, depending on hardware) and then rebuilds the Pact databases (this takes up to a day, depending on hardware). After that the db is fully validated and the --prune-chain-database flag can be omitted when restarting the node.
It is also recommended to configure the node so that it doesn't connect to the network during db validation (use --enable-private, --enable-ignore-bootstrap-nodes, and --disable-mempool-p2p).
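The flags mentioned so far can be collected into a single invocation. The following is a minimal sketch, not an official script: the helper name `validation_args` is hypothetical, and it only emits flags that are named in this document (including --database-directory, which is described further down).

```shell
# Sketch: print the command-line flags for an offline validation run,
# one flag per line. validation_args is a hypothetical helper name.
validation_args() {
    local dbdir="$1"
    printf '%s\n' \
        "--database-directory=${dbdir}" \
        "--prune-chain-database=headers-checked" \
        "--only-sync-pact" \
        "--validateHashesOnReplay" \
        "--enable-private" \
        "--enable-ignore-bootstrap-nodes" \
        "--disable-mempool-p2p"
}

# Usage (bash, not executed here):
#   mapfile -t args < <(validation_args "$DBDIR")
#   chainweb-node "${args[@]}"
```

Keeping the flags in one place makes it harder to forget one of the network-related options when the run is restarted.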
To speed up the process, ideally, one should use a machine with SSD disks and more than 22 fast CPU cores, or at least as close to 22 cores as possible. (It can be done a bit faster by putting the dbs on a RAM-disk of about 160GB, but then one has to make sure to persist the resulting databases before rebooting the machine).
Validation time differs between chains. In particular, chains 0-3 take much longer than the remaining chains.
Configure the location of your database (must match whatever is specified in the chainweb node configuration file or provided via --database-directory).
DBDIR=${DBDIR:-$HOME/.local/share/chainweb/}
The following command synchronizes a snapshot of the chain database. It does not download the Pact databases, because those are deleted anyway before validation.
Pick a chainweb node that is geographically close to you for SERVER (note that only some bootstrap nodes provide rsync access to the db; if possible, use a different source).
SERVER=${SERVER:-us-e3.chainweb.com}
rsync -avz --delete "${SERVER}::db/0/rocksDb" "$DBDIR/0/rocksDb"
Delete the Pact database:
rm -rf "$DBDIR"/0/sqlite
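Before starting the replay it can help to confirm that the directory is in the expected state: chain database present, Pact database gone. This is a minimal sketch under that assumption; `check_snapshot_layout` is a hypothetical helper and not part of chainweb-node.

```shell
# Sketch: confirm the directory layout expected before a validation run.
# check_snapshot_layout is a hypothetical helper name.
check_snapshot_layout() {
    local dbdir="${1%/}"   # strip one trailing slash, if any
    if [ ! -d "${dbdir}/0/rocksDb" ]; then
        echo "missing chain database: ${dbdir}/0/rocksDb" >&2
        return 1
    fi
    if [ -e "${dbdir}/0/sqlite" ]; then
        echo "Pact database still present: ${dbdir}/0/sqlite" >&2
        return 1
    fi
    echo "ok: chain database present, Pact database absent"
}

# Usage (not executed here):
#   check_snapshot_layout "$DBDIR" || exit 1
```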
Create configuration file for Pact database replay. With this configuration the chainweb-node will not connect to the network.
cat > validate-mainnet-db.yaml <<EOF
databaseDirectory: "$DBDIR"
chainweb:
  # stop node after finishing replay of pact databases
  onlySyncPact: true
  # this confirms the Merkle hash of the resulting payload with the respective
  # hash that is stored in the chain database.
  validateHashesOnReplay: true
  p2p:
    peer:
      interface: localhost
      hostaddress:
        port: 4445
    private: true
    ignoreBootstrapNodes: true
  transactionIndex:
    enabled: false
  mempoolP2p:
    enabled: false
  cuts:
    # The following causes full Merkle Tree validation on startup.
    # To only rebuild the Pact db set this to 'none'
    pruneChainDatabase: headers-checked
logging:
  telemetryBackend:
    enabled: false
    configuration:
      color: 'true'
  backend:
    color: 'true'
  filter:
    rules:
    - value: mempool
      key: component
      level: warn
    - value: connection-manager
      key: component
      level: warn
    - value: sync
      key: sub-component
      level: warn
    - value: MempoolAccess
      key: sub-component
      level: warn
    - value: PactService
      key: logger
      level: warn
    - value: Checkpointer
      key: logger
      level: warn
    default: info
EOF
Validate Chainweb database Merkle tree and rebuild Pact database:
chainweb-node --config-file=validate-mainnet-db.yaml
After the node terminates, look for the following log message: finished synchronizing Pact DBs
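Because the run can take many hours, it may be convenient to save the log and check for the completion message afterwards. The sketch below assumes the message appears verbatim on one line of the saved log; `replay_finished` and the file name `validate.log` are hypothetical.

```shell
# Sketch: succeed if a saved node log contains the completion message.
# replay_finished is a hypothetical helper name.
replay_finished() {
    local logfile="$1"
    # -F: match the message as a fixed string; -q: no output, status only
    grep -qF 'finished synchronizing Pact DBs' "$logfile"
}

# Usage (not executed here):
#   chainweb-node --config-file=validate-mainnet-db.yaml 2>&1 | tee validate.log
#   replay_finished validate.log && echo "validation run completed"
```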
The Chainweb database consists of two parts:
- The chain database, which is stored in a rocksdb database and which contains
  - block headers
  - raw block payloads
  - cached block outputs
  All this data is organized in a Merkle DAG which has the headers in the current cut as roots.
- The Pact database, which is stored in a sqlite database and which contains the Pact state.
There are two aspects of validating the consistency of a Chainweb database:
- Pact state consistency
- Consistency of the Chain Merkle Tree.
On startup, chainweb-node synchronizes the Pact state for each chain to the state of the chain database. To do this, the system identifies the block that corresponds to the latest Pact state. It then rewinds the Pact state to the largest common ancestor of this block and the most recent block in the chain database of the respective chain. After that it replays the transactions of all blocks up to the most recent block on the respective chain.
An empty or non-existent Pact database corresponds to the genesis header. Therefore, if the Pact database is missing, chainweb-node rebuilds the Pact database from scratch by replaying the transactions of all blocks up to the most recent block.
It is thus possible to rebuild the Pact database of an untrusted database simply by deleting the sqlite directory from the chainweb database directory before starting chainweb-node.
Time: several hours, up to a day.
Chainweb-node can validate the chain Merkle tree while pruning the chainweb database. This can be enabled with the following command line option:
--prune-chain-database=headers-checked
Depending on the system performance, validation of the Merkle tree can take anywhere from 30 min to several hours.
Using --prune-chain-database=none completely disables database pruning and Merkle tree validation.
Time: one to four hours
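The two settings described above can be selected with a small wrapper. This is only an illustrative sketch: the function name `prune_flag` and the mode names `full` and `pact-only` are hypothetical, and only the two flag values mentioned in this document are used.

```shell
# Sketch: map a validation mode to the matching pruning flag.
# prune_flag and the mode names are hypothetical.
prune_flag() {
    case "$1" in
        full)      echo "--prune-chain-database=headers-checked" ;;  # Merkle tree validation
        pact-only) echo "--prune-chain-database=none" ;;             # no pruning, no Merkle validation
        *)         echo "unknown mode: $1" >&2; return 2 ;;
    esac
}

# Usage (not executed here):
#   chainweb-node "$(prune_flag full)" --only-sync-pact --validateHashesOnReplay
```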