eth: make transaction propagation paths in the network deterministic (#29034)

commit ethereum/go-ethereum@0b1438c.

Geth currently propagates new transactions by sending them in full to
sqrt(peers) and announcing their hashes to the rest of the peers. The exceptions
are peers already known to have the transaction (which receive neither) and
large/blob transactions (which are only ever announced). For this PR's scope, we
don't care about the special cases, only about normal new transactions.
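
As a rough illustration, the current per-transaction split looks something like
the sketch below (a simplified, self-contained pseudo-example; the peer names
and the shuffle stand in for Geth's real peer-set handling, this is not the
actual code):

package main

import (
	"fmt"
	"math"
	"math/rand"
)

func main() {
	// Hypothetical set of peers that don't yet know about the transaction.
	peers := []string{"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9"}

	// Pick ~sqrt(peers) at random for a full broadcast; announce to the rest.
	numDirect := int(math.Sqrt(float64(len(peers)))) // 3 of 9 here
	rand.Shuffle(len(peers), func(i, j int) { peers[i], peers[j] = peers[j], peers[i] })

	fmt.Println("send in full to: ", peers[:numDirect])
	fmt.Println("announce hash to:", peers[numDirect:])
}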

The rationale behind the broadcast/announce split is that broadcasting to
everyone in full would be very wasteful, as everyone would in essence receive
the same transactions from all their peers. Announcing it to everyone on the
other hand would minimize traffic, but would maximise distribution latency as
everyone would need to explicitly ask for an announced transaction. Depending on
whatever timeout clients would use, this could lead to multi-second delays in a
single transaction's propagation time. Broadcasting to a few peers and
announcing to everyone else ensures that the transaction ripples through well
connected peers very fast and any degenerate part of the network is covered by
announcements. The ideal ratio of the split between broadcast and announce is
the topic of a different discussion.

The interesting tidbit for this PR is that the split between broadcast and
announce is currently done at random in Geth. We calculate that a new
transaction needs to be sent in full to N peers out of M, and just pick them at
random. This randomness is very much desired as it ensures that the network load
caused by transactions is evenly distributed across all connections. As long as
transactions are arriving at a steady rate from different accounts, this
mechanism works well. It doesn't matter who sends what: we randomly pass
transactions across the network, and everyone receives them one way or another.

A problem arises, however, when there is a burst of transactions from the same
account (whether insta-sending K transactions at once or sending them
individually in very quick succession). The problem is that the choice of whom
to send to in full and whom to announce to by hash is made randomly and
independently for each transaction.

With K transactions arriving simultaneously from the same account, each would
get randomly broadcast across our peer set. With probability approaching 1,
every peer ends up with a nonce-gapped sequence: some transactions received in
full, the gaps covered only by announcements. This is a double issue: nodes only
forward executable transactions, so whenever a peer encounters a nonce gap,
propagation is choked from that point onward. And even though the gaps are
announced, those announcements are acted on with a delay (whether the gap is
filled by someone else or the transaction has to be explicitly retrieved), by
which time the gapped transactions might already have been dropped.
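
To make the near-certainty of gaps concrete, here is a rough back-of-the-envelope
sketch with made-up numbers (50 peers, a 16-transaction burst): each peer
receives a given transaction in full with probability sqrt(M)/M, so it is
overwhelmingly likely to end up with a mix of fully-received and announce-only
transactions, which is exactly the gapped pattern described above.

package main

import (
	"fmt"
	"math"
)

func main() {
	// Illustrative numbers: 50 peers, a burst of 16 transactions from one account.
	peers, k := 50.0, 16.0
	pFull := math.Sqrt(peers) / peers // chance a given peer gets one tx in full (~0.14)

	allFull := math.Pow(pFull, k)    // peer receives the whole burst in full
	noneFull := math.Pow(1-pFull, k) // peer receives announcements only
	mixed := 1 - allFull - noneFull  // peer gets a mix, which almost always means a nonce gap

	fmt.Printf("all full: %.2e  none full: %.3f  mixed: %.3f\n", allFull, noneFull, mixed)
}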

The issue is even worse for K transactions arriving individually in quick
succession (say 50ms apart). The exact same problem arises, but we can't even
try to group transactions by account, because we don't know what we've broadcast
before or what future transactions will arrive. Tracking broadcast targets
across time is non-trivial complexity.

Geth's current solution to this problem is the transaction pool. In the "legacy"
pool, we track two sets of transactions: the pending set, containing all the
executable transactions (no nonce gaps), and the queued set, containing a mixed
bag of everything that's missing a nonce. As time passes and gaps are filled in,
we move queued transactions into the pending set. Whilst workable in theory, in
practice this constant shuffling makes the pool extremely brittle and easy to
attack. The only way to simplify the pool and make it both more robust and
possibly larger in capacity is to somehow get rid of this two-set complexity.
For that to happen, we need to fix transaction propagation to get rid of nonce
gaps altogether. Whilst it might be unfeasible to make propagation 100% accurate
and thus completely remove the pool's complexity, if we could make propagation
almost-perfect, we could probably also very aggressively simplify the txpool to
only track a minimal subset of gaps for "flukes".

Can we fix transaction propagation, though, or at least make it
"approximately correct"? This PR is an attempt at saying yes to that question.
What we would like to achieve is to keep the current performance of transaction
propagation (wrt bandwidth and latency), but avoid the nonce-gap-generation
issue. The only way to do that is to ensure that if a tx is broadcast in full to
a peer, all subsequent txs from the same account are also broadcast in full to
that peer. If, on the other hand, the tx is announced, all subsequent
transactions are announced too. The naive solution of tracking what we sent to
whom is a can of worms nobody wants to open (especially since we would like this
mechanism to work across a longer time frame).

The solution this PR proposes is to "define" a "semi-stable" transaction
broadcast/announce topology, where every node "knows" to whom it should
broadcast and to whom it should announce, without having a complete view of
the network or the transaction pool. It's ok if this "topology" is not
completely stable, but it should be stable "enough" to capture
semi-instantaneous bursts and keep them on the same propagation path wrt
broadcast/announce.

Instead of picking sqrt(peers) at random to broadcast to, or tracking to whom
we've broadcast before, the PR proposes to hash our own node ID together with a
peer's ID and with the tx sender, and use that checksum to select the
sqrt(peers) to broadcast to. The elegance of this algorithm is that as long as I
have a relatively stable number of peers, the same peers will be selected over
and over again for broadcasting, independent of what other peers are connected,
and with exactly zero state tracking. If enough peers join/leave to change the
sqrt(peers) value, the topology will change, but apart from some startup
wonkiness, the connections and pathways will be stable most of the time.
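
A minimal, standalone sketch of that selection rule is below (the
shouldBroadcast helper and its inputs are hypothetical, written to mirror the
check added in the diff further down; on average it picks roughly sqrt(peers)
of the peers for full broadcast per sender):

package main

import (
	"fmt"
	"math"
	"math/big"

	"golang.org/x/crypto/sha3"
)

// shouldBroadcast decides from stable inputs only (no state tracking) whether
// a given peer gets transactions from `sender` in full or announce-only.
func shouldBroadcast(selfID, peerID, sender []byte, peerCount int) bool {
	direct := int64(math.Sqrt(float64(peerCount)))
	if direct == 0 {
		direct = 1
	}
	total := direct * direct // stabilised peer count, based on sqrt(peers)

	h := sha3.NewLegacyKeccak256()
	h.Write(selfID)
	h.Write(peerID)
	h.Write(sender)
	sum := h.Sum(nil)

	// Broadcast in full iff keccak(self, peer, sender) mod total < direct.
	mod := new(big.Int).Mod(new(big.Int).SetBytes(sum), big.NewInt(total))
	return mod.Cmp(big.NewInt(direct)) < 0
}

func main() {
	// Same inputs always yield the same decision, so a burst from one account
	// follows the same broadcast/announce paths without any bookkeeping.
	for _, peer := range []string{"peer-1", "peer-2", "peer-3"} {
		fmt.Println(peer, shouldBroadcast([]byte("self-node"), []byte(peer), []byte("0xsender"), 25))
	}
}

Because the decision depends only on the local node ID, the peer ID and the
sender, it stays the same across a burst as long as the peer count (and hence
the sqrt-derived threshold) doesn't change.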

The immediate upside is that nonce gaps should almost completely disappear (the
more other clients also choose to implement this, or any other stable topology,
the better the stability; it doesn't have to be the same one). With nonce gaps
minimised, we would be able to drastically simplify the txpool's gapped-tx
handling, since gaps would become the exception, not the general rule. Also
important to highlight: this change is essentially free from all perspectives:
zero computational cost, zero added complexity, and zero effort to add to Geth
or any other client.
karalabe authored and minh-bq committed Sep 16, 2024
1 parent c1d16b8 commit efbc4f7
Showing 2 changed files with 47 additions and 11 deletions.
1 change: 1 addition & 0 deletions eth/backend.go
@@ -278,6 +278,7 @@ func New(stack *node.Node, config *ethconfig.Config) (*Ethereum, error) {
 		}
 	}
 	if eth.handler, err = newHandler(&handlerConfig{
+		NodeID:     eth.p2pServer.Self().ID(),
 		Database:   chainDb,
 		Chain:      eth.blockchain,
 		TxPool:     eth.txPool,
57 changes: 46 additions & 11 deletions eth/handler.go
@@ -30,6 +30,7 @@ import (
 	"github.com/ethereum/go-ethereum/core/txpool"
 	"github.com/ethereum/go-ethereum/core/types"
 	"github.com/ethereum/go-ethereum/core/vote"
+	"github.com/ethereum/go-ethereum/crypto"
 	"github.com/ethereum/go-ethereum/eth/downloader"
 	"github.com/ethereum/go-ethereum/eth/fetcher"
 	"github.com/ethereum/go-ethereum/eth/protocols/eth"
@@ -40,8 +41,10 @@ import (
 	"github.com/ethereum/go-ethereum/log"
 	"github.com/ethereum/go-ethereum/metrics"
 	"github.com/ethereum/go-ethereum/p2p"
+	"github.com/ethereum/go-ethereum/p2p/enode"
 	"github.com/ethereum/go-ethereum/params"
 	"github.com/ethereum/go-ethereum/trie"
+	"golang.org/x/crypto/sha3"
 )
 
 const (
@@ -85,6 +88,7 @@ type txPool interface {
 // handlerConfig is the collection of initialization parameters to create a full
 // node network handler.
 type handlerConfig struct {
+	NodeID     enode.ID          // P2P node ID used for tx propagation topology
 	Database   ethdb.Database    // Database for direct sync insertions
 	Chain      *core.BlockChain  // Blockchain to serve data from
 	TxPool     txPool            // Transaction pool to propagate from
@@ -99,6 +103,7 @@ type handlerConfig struct {
 }
 
 type handler struct {
+	nodeID     enode.ID
 	networkID  uint64
 	forkFilter forkid.Filter // Fork ID filter, constant across the lifetime of the node

@@ -149,6 +154,7 @@ func newHandler(config *handlerConfig) (*handler, error) {
 		config.EventMux = new(event.TypeMux) // Nicety initialization for tests
 	}
 	h := &handler{
+		nodeID:     config.NodeID,
 		networkID:  config.Network,
 		forkFilter: forkid.NewFilter(config.Chain),
 		eventMux:   config.EventMux,
@@ -587,25 +593,54 @@ func (h *handler) BroadcastTransactions(txs types.Transactions) {
 
 	)
 	// Broadcast transactions to a batch of peers not knowing about it
-	for _, tx := range txs {
-		peers := h.peers.peersWithoutTransaction(tx.Hash())
-
-		var numDirect int
+	direct := big.NewInt(int64(math.Sqrt(float64(h.peers.len())))) // Approximate number of peers to broadcast to
+	if direct.BitLen() == 0 {
+		direct = big.NewInt(1)
+	}
+	total := new(big.Int).Exp(direct, big.NewInt(2), nil) // Stabilise total peer count a bit based on sqrt peers
+
+	var (
+		signer = types.LatestSignerForChainID(h.chain.Config().ChainID) // Don't care about chain status, we just need *a* sender
+		hasher = sha3.NewLegacyKeccak256().(crypto.KeccakState)
+		hash   = make([]byte, 32)
+	)
+	for _, tx := range txs {
+		var maybeDirect bool
 		switch {
 		case tx.Type() == types.BlobTxType:
 			blobTxs++
 		case tx.Size() > txMaxBroadcastSize:
 			largeTxs++
 		default:
-			numDirect = int(math.Sqrt(float64(len(peers))))
+			maybeDirect = true
 		}
-		// Send the tx unconditionally to a subset of our peers
-		for _, peer := range peers[:numDirect] {
-			txset[peer] = append(txset[peer], tx.Hash())
-		}
-		// For the remaining peers, send announcement only
-		for _, peer := range peers[numDirect:] {
-			annos[peer] = append(annos[peer], tx.Hash())
+		// Send the transaction (if it's small enough) directly to a subset of
+		// the peers that have not received it yet, ensuring that the flow of
+		// transactions is grouped by account to (try and) avoid nonce gaps.
+		//
+		// To do this, we hash the local enode ID together with a peer's enode
+		// ID and the transaction sender, and broadcast if
+		// `sha(self, peer, sender) mod peers < sqrt(peers)`.
+		for _, peer := range h.peers.peersWithoutTransaction(tx.Hash()) {
+			var broadcast bool
+			if maybeDirect {
+				hasher.Reset()
+				hasher.Write(h.nodeID.Bytes())
+				hasher.Write(peer.Node().ID().Bytes())
+
+				from, _ := types.Sender(signer, tx) // Ignore error, we only use the addr as a propagation target splitter
+				hasher.Write(from.Bytes())
+
+				hasher.Read(hash)
+				if new(big.Int).Mod(new(big.Int).SetBytes(hash), total).Cmp(direct) < 0 {
+					broadcast = true
+				}
+			}
+			if broadcast {
+				txset[peer] = append(txset[peer], tx.Hash())
+			} else {
+				annos[peer] = append(annos[peer], tx.Hash())
+			}
 		}
 	}
 	for peer, hashes := range txset {
