diff --git a/images/transaction-analysis/.DS_Store b/images/transaction-analysis/.DS_Store
new file mode 100644
index 0000000..7a4283c
Binary files /dev/null and b/images/transaction-analysis/.DS_Store differ
diff --git a/images/transaction-analysis/100-blocks-tx-size.png b/images/transaction-analysis/100-blocks-tx-size.png
new file mode 100644
index 0000000..c1a5fbe
Binary files /dev/null and b/images/transaction-analysis/100-blocks-tx-size.png differ
diff --git a/images/transaction-analysis/length.png b/images/transaction-analysis/length.png
new file mode 100644
index 0000000..8ac71ce
Binary files /dev/null and b/images/transaction-analysis/length.png differ
diff --git a/images/transaction-analysis/mempool-tx-size.png b/images/transaction-analysis/mempool-tx-size.png
new file mode 100644
index 0000000..08803d8
Binary files /dev/null and b/images/transaction-analysis/mempool-tx-size.png differ
diff --git a/traffic-analysis.md b/traffic-analysis.md
new file mode 100644
index 0000000..0b2e248
--- /dev/null
+++ b/traffic-analysis.md
@@ -0,0 +1,53 @@
+# I still see you!
+
+## Problem
+
+Sending transactions across the P2P network is sensitive data.
+
+![show transaction tracking based on size](./images/transaction-analysis/length.png)
+
+Global passive observers can differentiate whether it's this tx or that tx from the size of the packet!
+
+## Statistics
+
+### Common transaction size (in bytes) in the last 100 blocks
+![show common transaction size in last 100 blocks](./images/transaction-analysis/100-blocks-tx-size.png)
+- 96.7 % of transactions below 1000 bytes
+- 99% of transactions below 3000 bytes
+
+### Common transaction size (in bytes) in the mempool
+![show common transaction size in mempool](./images/transaction-analysis/mempool-tx-size.png)
+- 89.6 % of transactions below 1000 bytes
+- 92% of transactions below 3000 bytes
+
+## Solution
+
+1. Pad transaction messages
+
+- last 100 blocks
+
+  |                          | pad to a fixed size=1000 | pad to a fixed size=3000 | pad to a dynamic size |
+  |--------------------------|--------------------------|--------------------------|-----------------------|
+  | how much padding?        | 1000                     | 3000                     | Padmé algo            |
+  | what is extra bandwidth? | 43 MB                    | 162 MB                   | 10.9 GB               |
+
+- mempool
+
+  |                          | pad to a fixed size=1000 | pad to a fixed size=3000 | pad to a dynamic size |
+  |--------------------------|--------------------------|--------------------------|-----------------------|
+  | how much padding?        | 1000                     | 3000                     | Padmé algo            |
+  | what is extra bandwidth? | 37 MB                    | 139.8 MB                 | 123 MB                |
+
+2. Broadcast decoy messages with size (transaction message)
+
+## Code
+
+- branch - https://github.com/stratospher/bitcoin/tree/2024-05-tx-traffic
+- jupyter notebook - https://colab.research.google.com/drive/1q2fLkxuAiqr2hlAEQ4r6x2Lx9bTh4bN0?usp=sharing
+
+## Conclusion
+
+|                 | advantage                | disadvantage                                                           |
+|-----------------|--------------------------|------------------------------------------------------------------------|
+| fixed padding   | bandwidth efficient      | can't hide the outliers + optimal padding might vary based on data set |
+| dynamic padding | covers even the outliers | too much bandwidth                                                     |
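
Review note: the two padding strategies compared in the tables can be sketched as below. This is a minimal illustration, not the code from the linked branch or notebook. `padme` follows the Padmé length function as it is commonly published, and the branch may implement it differently. `pad_fixed`, `extra_bandwidth`, and the sample sizes are hypothetical names and values introduced only for this example.

```python
def padme(length: int) -> int:
    """Padded length in bytes for a message of `length` bytes (Padmé)."""
    if length < 2:
        return length  # log2 is undefined for 0, and nothing to round for 1
    e = length.bit_length() - 1   # floor(log2(length))
    s = e.bit_length()            # floor(log2(e)) + 1
    mask = (1 << (e - s)) - 1     # low bits that get rounded away
    return (length + mask) & ~mask  # round up to a multiple of 2**(e - s)


def pad_fixed(length: int, target: int) -> int:
    """Pad up to `target` bytes; larger messages stay visible as outliers."""
    return max(length, target)


def extra_bandwidth(sizes, pad) -> int:
    """Total padding bytes added across all messages by the `pad` function."""
    return sum(pad(n) - n for n in sizes)


if __name__ == "__main__":
    # Hypothetical transaction sizes in bytes, not taken from the data set.
    sizes = [192, 250, 1000, 3000, 50_000]
    print("fixed 1000:", extra_bandwidth(sizes, lambda n: pad_fixed(n, 1000)))
    print("fixed 3000:", extra_bandwidth(sizes, lambda n: pad_fixed(n, 3000)))
    print("Padmé:     ", extra_bandwidth(sizes, padme))
```

Running the same three calls over the real size lists (last 100 blocks, mempool) would be one way to cross-check the extra-bandwidth rows in the tables, assuming those rows count padding bytes per transaction message.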