Merge bitcoin/bitcoin#26593: tracing: Only prepare tracepoint arguments when actually tracing

0de3e96 tracing: use bitcoind pid in bcc tracing examples (0xb10c)
411c6cf tracing: only prepare tracepoint args if attached (0xb10c)
d524c1e tracing: dedup TRACE macros & rename to TRACEPOINT (0xb10c)

Pull request description:

  Currently, if the tracepoints are compiled (e.g. in depends and release builds), we always prepare the tracepoint arguments regardless of the tracepoints being used or not. We made sure that the argument preparation is as cheap as possible, but we can almost completely eliminate any overhead for users not interested in the tracepoints (the vast majority), by gating the tracepoint argument preparation with an `if(something is attached to this tracepoint)`. To achieve this, we use the optional semaphore feature provided by SystemTap.

  The first commit simplifies and deduplicates our tracepoint macros from 13 TRACEx macros to a single TRACEPOINT macro. This makes them easier to use and also avoids more duplicate macro definitions in the second commit.

  The Linux tracing tools I'm aware of (bcc, bpftrace, libbpf, and systemtap) all support the semaphore gating feature. Thus, all existing tracepoints and their argument preparation are gated in the second commit. For details, please refer to the commit messages and the updated documentation in `doc/tracing.md`.

  Also adding unit tests that include all tracepoint macros to make sure there are no compiler problems with them (e.g. some variadic extension not supported).

  Reviewers might want to check:
  - Do the tracepoints still work for you? Do the examples in `contrib/tracing/` run on your system (as bpftrace frequently breaks with new versions, please test master too if it shouldn't work for you)? Do the CI interface tests still pass?
  - Is the new documentation clear?
  - The `TRACEPOINT_SEMAPHORE(context, event)` macro places global variables in our global namespace. Is this something we strictly want to avoid, or should we maybe move all `TRACEPOINT_SEMAPHORE`s to a separate .cpp file or even a namespace? I like having the `TRACEPOINT_SEMAPHORE()` in the same file as the `TRACEPOINT()`, but I am open to suggestions on alternative approaches.
  - Are newly added tracepoints in the unit tests visible when using `readelf -n build/src/test/test_bitcoin`? You can run the new unit tests with `./build/src/test/test_bitcoin --run_test=util_trace_tests* --log_level=all`.
  <details><summary>Two of the added unit tests demonstrate that we are only processing the tracepoint arguments when attached by having a test-failure condition in the tracepoint argument preparation. The following bpftrace script can be used to demonstrate that the tests do indeed fail when attached to the tracepoints.</summary>

  `fail_tests.bt`:

  ```c
  #!/usr/bin/env bpftrace

  usdt:./build/src/test/test_bitcoin:test:check_if_attached {
    printf("the 'check_if_attached' test should have failed\n");
  }

  usdt:./build/src/test/test_bitcoin:test:expensive_section {
    printf("the 'expensive_section' test should have failed\n");
  }
  ```

  Run the unit tests with `./build/src/test/test_bitcoin` and start `bpftrace fail_tests.bt -p $(pidof test_bitcoin)` in a separate terminal. The unit tests should fail with:

  ```
  Running 594 test cases...
  test/util_trace_tests.cpp(31): error: in "util_trace_tests/test_tracepoint_check_if_attached": check false has failed
  test/util_trace_tests.cpp(51): error: in "util_trace_tests/test_tracepoint_manual_tracepoint_active_check": check false has failed

  *** 2 failures are detected in the test module "Bitcoin Core Test Suite"
  ```

  </details>

  These links might provide more contextual information for reviewers:
  - [How SystemTap Userspace Probes Work by eklitzke](https://eklitzke.org/how-sytemtap-userspace-probes-work) (actually an example on Bitcoin Core; mentions that with semaphores "the overhead for an untraced process is effectively zero.")
  - [libbpf comment on USDT semaphore handling](https://github.com/libbpf/libbpf/blob/1596a09b5de2a50ab8d44218fc29b6d42f886305/src/usdt.c#L83-L92) (can recommend the whole comment for background on how the tracepoints and tracing tools work together)
  - https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation#Semaphore_Handling

ACKs for top commit:
  willcl-ark:
    utACK 0de3e96
  laanwj:
    re-ACK 0de3e96
  jb55:
    utACK 0de3e96
  vasild:
    ACK 0de3e96

Tree-SHA512: 0e5e0dc5e0353beaf5c446e4be03d447e64228b1be71ee9972fde1d6fac3fac71a9d73c6ce4fa68975f87db2b2bf6eee2009921a2a145e24d83a475d007a559b
fanquake committed Nov 11, 2024
2 parents 0903ce8 + 0de3e96 commit 19f2777
Showing 16 changed files with 222 additions and 119 deletions.
7 changes: 5 additions & 2 deletions cmake/module/FindUSDT.cmake
@@ -36,13 +36,16 @@ if(USDT_INCLUDE_DIR)
include(CheckCXXSourceCompiles)
set(CMAKE_REQUIRED_INCLUDES ${USDT_INCLUDE_DIR})
check_cxx_source_compiles("
// Setting SDT_USE_VARIADIC lets systemtap (sys/sdt.h) know that we want to use
// the optional variadic macros to define tracepoints.
#define SDT_USE_VARIADIC 1
#include <sys/sdt.h>
int main()
{
DTRACE_PROBE(context, event);
STAP_PROBEV(context, event);
int a, b, c, d, e, f, g;
DTRACE_PROBE7(context, event, a, b, c, d, e, f, g);
STAP_PROBEV(context, event, a, b, c, d, e, f, g);
}
" HAVE_USDT_H
)
8 changes: 4 additions & 4 deletions contrib/tracing/README.md
@@ -82,7 +82,7 @@ about the connection. Peers can be selected individually to view recent P2P
messages.

```
$ python3 contrib/tracing/p2p_monitor.py ./build/src/bitcoind
$ python3 contrib/tracing/p2p_monitor.py $(pidof bitcoind)
```

Lists selectable peers and traffic and connection information.
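
Since the example scripts now expect a PID instead of a path to the binary, a stale habit such as passing `./build/src/bitcoind` would only fail once `int(pid)` raises. The following is a minimal sketch of validating the argument up front. The helpers `parse_pid` and `process_exists` are hypothetical and not part of `contrib/tracing/`; the scripts in this change call `int(pid)` directly.

```python
import os
import sys


def parse_pid(argv):
    """Return the PID passed as the only argument, or None if it is unusable.

    Hypothetical helper for illustration; not part of contrib/tracing/.
    """
    if len(argv) != 2:
        return None
    try:
        pid = int(argv[1])
    except ValueError:
        # e.g. a path such as ./build/src/bitcoind was passed
        return None
    return pid if pid > 0 else None


def process_exists(pid):
    """Best-effort check that a process with this PID is currently running."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    pid = parse_pid(sys.argv)
    if pid is None or not process_exists(pid):
        print("USAGE:", sys.argv[0], "<pid of bitcoind>")
        sys.exit(1)
    print(f"Would hook into bitcoind with pid {pid}")
```

The existence check is only advisory: the process can still exit between the check and the moment a tracing tool attaches.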
@@ -150,7 +150,7 @@ lost. BCC prints: `Possibly lost 2 samples` on lost messages.


```
$ python3 contrib/tracing/log_raw_p2p_msgs.py ./build/src/bitcoind
$ python3 contrib/tracing/log_raw_p2p_msgs.py $(pidof bitcoind)
```

```
@@ -241,7 +241,7 @@ A BCC Python script to log the UTXO cache flushes. Based on the
`utxocache:flush` tracepoint.

```bash
$ python3 contrib/tracing/log_utxocache_flush.py ./build/src/bitcoind
$ python3 contrib/tracing/log_utxocache_flush.py $(pidof bitcoind)
```

```
@@ -300,7 +300,7 @@ comprising a timestamp along with all event data available via the event's
tracepoint.

```console
$ python3 contrib/tracing/mempool_monitor.py ./build/src/bitcoind
$ python3 contrib/tracing/mempool_monitor.py $(pidof bitcoind)
```

```
13 changes: 7 additions & 6 deletions contrib/tracing/log_raw_p2p_msgs.py
@@ -132,8 +132,9 @@ def print_message(event, inbound):
)


def main(bitcoind_path):
bitcoind_with_usdts = USDT(path=str(bitcoind_path))
def main(pid):
print(f"Hooking into bitcoind with pid {pid}")
bitcoind_with_usdts = USDT(pid=int(pid))

# attaching the trace functions defined in the BPF program to the tracepoints
bitcoind_with_usdts.enable_probe(
@@ -176,8 +177,8 @@ def handle_outbound(_, data, size):


if __name__ == "__main__":
if len(sys.argv) < 2:
print("USAGE:", sys.argv[0], "path/to/bitcoind")
if len(sys.argv) != 2:
print("USAGE:", sys.argv[0], "<pid of bitcoind>")
exit()
path = sys.argv[1]
main(path)
pid = sys.argv[1]
main(pid)
13 changes: 7 additions & 6 deletions contrib/tracing/log_utxocache_flush.py
@@ -70,8 +70,9 @@ def print_event(event):
))


def main(bitcoind_path):
bitcoind_with_usdts = USDT(path=str(bitcoind_path))
def main(pid):
print(f"Hooking into bitcoind with pid {pid}")
bitcoind_with_usdts = USDT(pid=int(pid))

# attaching the trace functions defined in the BPF program
# to the tracepoints
@@ -99,9 +100,9 @@ def handle_flush(_, data, size):


if __name__ == "__main__":
if len(sys.argv) < 2:
print("USAGE: ", sys.argv[0], "path/to/bitcoind")
if len(sys.argv) != 2:
print("USAGE: ", sys.argv[0], "<pid of bitcoind>")
exit(1)

path = sys.argv[1]
main(path)
pid = sys.argv[1]
main(pid)
11 changes: 6 additions & 5 deletions contrib/tracing/mempool_monitor.py
@@ -114,8 +114,9 @@
"""


def main(bitcoind_path):
bitcoind_with_usdts = USDT(path=str(bitcoind_path))
def main(pid):
print(f"Hooking into bitcoind with pid {pid}")
bitcoind_with_usdts = USDT(pid=int(pid))

# attaching the trace functions defined in the BPF program
# to the tracepoints
@@ -365,8 +366,8 @@ def timestamp_age(timestamp):

if __name__ == "__main__":
if len(sys.argv) < 2:
print("USAGE: ", sys.argv[0], "path/to/bitcoind")
print("USAGE: ", sys.argv[0], "<pid of bitcoind>")
exit(1)

path = sys.argv[1]
main(path)
pid = sys.argv[1]
main(pid)
22 changes: 14 additions & 8 deletions contrib/tracing/p2p_monitor.py
@@ -14,8 +14,9 @@
# outbound P2P messages. The eBPF program submits the P2P messages to
# this script via a BPF ring buffer.

import sys
import curses
import os
import sys
from curses import wrapper, panel
from bcc import BPF, USDT

@@ -115,10 +116,10 @@ def add_message(self, message):
self.total_outbound_msgs += 1


def main(bitcoind_path):
def main(pid):
peers = dict()

bitcoind_with_usdts = USDT(path=str(bitcoind_path))
print(f"Hooking into bitcoind with pid {pid}")
bitcoind_with_usdts = USDT(pid=int(pid))

# attaching the trace functions defined in the BPF program to the tracepoints
bitcoind_with_usdts.enable_probe(
@@ -245,9 +246,14 @@ def render(screen, peers, cur_list_pos, scroll, ROWS_AVALIABLE_FOR_LIST, info_pa
(msg.msg_type, msg.size), curses.A_NORMAL)


def running_as_root():
return os.getuid() == 0

if __name__ == "__main__":
if len(sys.argv) < 2:
print("USAGE:", sys.argv[0], "path/to/bitcoind")
if len(sys.argv) != 2:
print("USAGE:", sys.argv[0], "<pid of bitcoind>")
exit()
path = sys.argv[1]
main(path)
if not running_as_root():
print("You might not have the privileges required to hook into the tracepoints!")
pid = sys.argv[1]
main(pid)
94 changes: 49 additions & 45 deletions doc/tracing.md
@@ -265,42 +265,52 @@ Arguments passed:

## Adding tracepoints to Bitcoin Core

To add a new tracepoint, `#include <util/trace.h>` in the compilation unit where
the tracepoint is inserted. Use one of the `TRACEx` macros listed below
depending on the number of arguments passed to the tracepoint. Up to 12
arguments can be provided. The `context` and `event` specify the names by which
the tracepoint is referred to. Please use `snake_case` and try to make sure that
the tracepoint names make sense even without detailed knowledge of the
implementation details. Do not forget to update the tracepoint list in this
document.

```c
#define TRACE(context, event)
#define TRACE1(context, event, a)
#define TRACE2(context, event, a, b)
#define TRACE3(context, event, a, b, c)
#define TRACE4(context, event, a, b, c, d)
#define TRACE5(context, event, a, b, c, d, e)
#define TRACE6(context, event, a, b, c, d, e, f)
#define TRACE7(context, event, a, b, c, d, e, f, g)
#define TRACE8(context, event, a, b, c, d, e, f, g, h)
#define TRACE9(context, event, a, b, c, d, e, f, g, h, i)
#define TRACE10(context, event, a, b, c, d, e, f, g, h, i, j)
#define TRACE11(context, event, a, b, c, d, e, f, g, h, i, j, k)
#define TRACE12(context, event, a, b, c, d, e, f, g, h, i, j, k, l)
```
Use the `TRACEPOINT` macro to add a new tracepoint. If not yet included, include
`util/trace.h` (defines the tracepoint macros) with `#include <util/trace.h>`.
Each tracepoint needs a `context` and an `event`. Please use `snake_case` and
try to make sure that the tracepoint names make sense even without detailed
knowledge of the implementation details. You can pass zero to twelve arguments
to the tracepoint. Each tracepoint also needs a global semaphore. The semaphore
prevents the tracepoint arguments from being processed while nothing is attached
to the tracepoint. Add a `TRACEPOINT_SEMAPHORE(context, event)` with the `context`
and `event` of your tracepoint in the top-level namespace at the beginning of
the file. Do not forget to update the tracepoint list in this document.

For example, the `net:outbound_message` tracepoint in `src/net.cpp` takes six
arguments.

For example:
```C++
// src/net.cpp
TRACEPOINT_SEMAPHORE(net, outbound_message);
void CConnman::PushMessage(…) {
TRACEPOINT(net, outbound_message,
pnode->GetId(),
pnode->m_addr_name.c_str(),
pnode->ConnectionTypeAsString().c_str(),
sanitizedType.c_str(),
msg.data.size(),
msg.data.data()
);
}
```
If needed, an extra `if (TRACEPOINT_ACTIVE(context, event)) {...}` check can be
used to prepare somewhat expensive arguments right before the tracepoint. While
the tracepoint arguments are only prepared when we attach something to the
tracepoint, argument preparation should never hang the process. Hashing and
serializing data structures is probably fine; a `sleep(10s)` is not.
```C++
TRACE6(net, inbound_message,
pnode->GetId(),
pnode->m_addr_name.c_str(),
pnode->ConnectionTypeAsString().c_str(),
sanitizedType.c_str(),
msg.data.size(),
msg.data.data()
);
// An example tracepoint with an expensive argument.
TRACEPOINT_SEMAPHORE(example, gated_expensive_argument);
if (TRACEPOINT_ACTIVE(example, gated_expensive_argument)) {
expensive_argument = expensive_calculation();
TRACEPOINT(example, gated_expensive_argument, expensive_argument);
}
```
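
The gating idea itself is not specific to C++ macros. As a rough illustration, here is a toy model in Python, which also evaluates call arguments eagerly unless they are guarded. The `Tracepoint` class and its `fire` method are hypothetical and only mimic the behavior described above: a per-tracepoint counter that a tracing tool raises while attached, and argument preparation that is skipped while the counter is zero. It is not how `util/trace.h` is implemented.

```python
class Tracepoint:
    """Toy model of a semaphore-gated tracepoint (not Bitcoin Core code)."""

    def __init__(self, context, event):
        self.name = f"{context}:{event}"
        self.semaphore = 0  # assumed to be raised by a tracer while attached
        self.fired = []     # argument tuples that would reach the tracer

    def active(self):
        """Return True while at least one tracer is attached."""
        return self.semaphore > 0

    def fire(self, make_args):
        """Prepare the arguments via make_args() only when a tracer is attached."""
        if self.active():
            self.fired.append(make_args())


def demo():
    calls = []

    def expensive_argument():
        calls.append(1)  # record every time the argument is actually computed
        return sum(range(1000))

    tp = Tracepoint("example", "gated_expensive_argument")
    tp.fire(lambda: (expensive_argument(),))  # nothing attached: skipped
    tp.semaphore += 1                         # a tracer attaches
    tp.fire(lambda: (expensive_argument(),))  # argument is prepared once
    tp.semaphore -= 1                         # the tracer detaches
    tp.fire(lambda: (expensive_argument(),))  # skipped again
    return len(calls), tp.fired
```

In this model `demo()` computes the expensive argument exactly once, for the single call made while the counter is non-zero.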

### Guidelines and best practices
@@ -318,12 +328,6 @@ the tracepoint. See existing examples in [contrib/tracing/].

[contrib/tracing/]: ../contrib/tracing/

#### No expensive computations for tracepoints
Data passed to the tracepoint should be inexpensive to compute. Although the
tracepoint itself only has overhead when enabled, the code to compute arguments
is always run - even if the tracepoint is not used. For example, avoid
serialization and parsing.

#### Semi-stable API
Tracepoints should have a semi-stable API. Users should be able to rely on the
tracepoints for scripting. This means tracepoints need to be documented, and the
@@ -347,7 +351,7 @@ first six argument fields. Binary data can be placed in later arguments. The BCC
supports reading from all 12 arguments.

#### Strings as C-style String
Generally, strings should be passed into the `TRACEx` macros as pointers to
Generally, strings should be passed into the `TRACEPOINT` macros as pointers to
C-style strings (a null-terminated sequence of characters). For C++
`std::strings`, [`c_str()`] can be used. It's recommended to document the
maximum expected string size if known.
@@ -370,9 +374,9 @@ $ gdb ./build/src/bitcoind
(gdb) info probes
Type Provider Name Where Semaphore Object
stap net inbound_message 0x000000000014419e /build/src/bitcoind
stap net outbound_message 0x0000000000107c05 /build/src/bitcoind
stap validation block_connected 0x00000000002fb10c /build/src/bitcoind
stap net inbound_message 0x000000000014419e 0x0000000000d29bd2 /build/src/bitcoind
stap net outbound_message 0x0000000000107c05 0x0000000000d29bd0 /build/src/bitcoind
stap validation block_connected 0x00000000002fb10c 0x0000000000d29bd8 /build/src/bitcoind
```

Expand All @@ -388,7 +392,7 @@ Displaying notes found in: .note.stapsdt
stapsdt 0x0000005d NT_STAPSDT (SystemTap probe descriptors)
Provider: net
Name: outbound_message
Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000000000
Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000d29bd0
Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx
```
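
A non-zero `Semaphore:` address in the `readelf -n` output above indicates that the tracepoint is gated. When checking many tracepoints, the text can be scanned with a small script. The sketch below assumes the note layout shown above (`Provider:`, `Name:`, then a `Location: …, Base: …, Semaphore: …` line); the function names and the `SAMPLE` constant are made up for this illustration.

```python
import re

# Matches one stapsdt note as printed by `readelf -n` in the layout shown above.
NOTE_RE = re.compile(
    r"Provider:\s*(?P<provider>\S+)\s+"
    r"Name:\s*(?P<name>\S+)\s+"
    r"Location:\s*0x[0-9a-fA-F]+,\s*Base:\s*0x[0-9a-fA-F]+,\s*"
    r"Semaphore:\s*(?P<semaphore>0x[0-9a-fA-F]+)"
)

# Sample text copied from the documentation example above.
SAMPLE = """\
stapsdt 0x0000005d NT_STAPSDT (SystemTap probe descriptors)
Provider: net
Name: outbound_message
Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000d29bd0
Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx
"""


def probes_with_semaphores(readelf_output):
    """Map 'provider:name' to the semaphore address found in the notes."""
    return {
        f"{m['provider']}:{m['name']}": int(m["semaphore"], 16)
        for m in NOTE_RE.finditer(readelf_output)
    }


def ungated(probes):
    """Return the sorted names of tracepoints whose semaphore address is zero."""
    return sorted(name for name, sema in probes.items() if sema == 0)


if __name__ == "__main__":
    probes = probes_with_semaphores(SAMPLE)
    for name, sema in probes.items():
        print(f"{name}: semaphore at {sema:#x}")
    print("ungated:", ungated(probes))
```

To run it against a real binary, the output of `readelf -n ./build/src/bitcoind` could be passed in place of `SAMPLE`, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.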
@@ -407,7 +411,7 @@ between distributions. For example, on

```
$ tplist -l ./build/src/bitcoind -v
b'net':b'outbound_message' [sema 0x0]
b'net':b'outbound_message' [sema 0xd29bd0]
1 location(s)
6 argument(s)
10 changes: 7 additions & 3 deletions src/coins.cpp
@@ -9,6 +9,10 @@
#include <random.h>
#include <util/trace.h>

TRACEPOINT_SEMAPHORE(utxocache, add);
TRACEPOINT_SEMAPHORE(utxocache, spent);
TRACEPOINT_SEMAPHORE(utxocache, uncache);

std::optional<Coin> CCoinsView::GetCoin(const COutPoint& outpoint) const { return std::nullopt; }
uint256 CCoinsView::GetBestBlock() const { return uint256(); }
std::vector<uint256> CCoinsView::GetHeadBlocks() const { return std::vector<uint256>(); }
@@ -97,7 +101,7 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
it->second.coin = std::move(coin);
it->second.AddFlags(CCoinsCacheEntry::DIRTY | (fresh ? CCoinsCacheEntry::FRESH : 0), *it, m_sentinel);
cachedCoinsUsage += it->second.coin.DynamicMemoryUsage();
TRACE5(utxocache, add,
TRACEPOINT(utxocache, add,
outpoint.hash.data(),
(uint32_t)outpoint.n,
(uint32_t)it->second.coin.nHeight,
@@ -131,7 +135,7 @@ bool CCoinsViewCache::SpendCoin(const COutPoint &outpoint, Coin* moveout) {
CCoinsMap::iterator it = FetchCoin(outpoint);
if (it == cacheCoins.end()) return false;
cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
TRACE5(utxocache, spent,
TRACEPOINT(utxocache, spent,
outpoint.hash.data(),
(uint32_t)outpoint.n,
(uint32_t)it->second.coin.nHeight,
@@ -278,7 +282,7 @@ void CCoinsViewCache::Uncache(const COutPoint& hash)
CCoinsMap::iterator it = cacheCoins.find(hash);
if (it != cacheCoins.end() && !it->second.IsDirty() && !it->second.IsFresh()) {
cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
TRACE5(utxocache, uncache,
TRACEPOINT(utxocache, uncache,
hash.hash.data(),
(uint32_t)hash.n,
(uint32_t)it->second.coin.nHeight,
4 changes: 3 additions & 1 deletion src/net.cpp
@@ -53,6 +53,8 @@
#include <optional>
#include <unordered_map>

TRACEPOINT_SEMAPHORE(net, outbound_message);

/** Maximum number of block-relay-only anchor connections */
static constexpr size_t MAX_BLOCK_RELAY_ONLY_ANCHORS = 2;
static_assert (MAX_BLOCK_RELAY_ONLY_ANCHORS <= static_cast<size_t>(MAX_BLOCK_RELAY_ONLY_CONNECTIONS), "MAX_BLOCK_RELAY_ONLY_ANCHORS must not exceed MAX_BLOCK_RELAY_ONLY_CONNECTIONS.");
@@ -3811,7 +3813,7 @@ void CConnman::PushMessage(CNode* pnode, CSerializedNetMsg&& msg)
CaptureMessage(pnode->addr, msg.m_type, msg.data, /*is_incoming=*/false);
}

TRACE6(net, outbound_message,
TRACEPOINT(net, outbound_message,
pnode->GetId(),
pnode->m_addr_name.c_str(),
pnode->ConnectionTypeAsString().c_str(),
4 changes: 3 additions & 1 deletion src/net_processing.cpp
@@ -57,6 +57,8 @@

using namespace util::hex_literals;

TRACEPOINT_SEMAPHORE(net, inbound_message);

/** Headers download timeout.
* Timeout = base + per_header * (expected number of headers) */
static constexpr auto HEADERS_DOWNLOAD_TIMEOUT_BASE = 15min;
@@ -4969,7 +4971,7 @@ bool PeerManagerImpl::ProcessMessages(CNode* pfrom, std::atomic<bool>& interrupt
CNetMessage& msg{poll_result->first};
bool fMoreWork = poll_result->second;

TRACE6(net, inbound_message,
TRACEPOINT(net, inbound_message,
pfrom->GetId(),
pfrom->m_addr_name.c_str(),
pfrom->ConnectionTypeAsString().c_str(),
1 change: 1 addition & 0 deletions src/test/CMakeLists.txt
@@ -136,6 +136,7 @@ add_executable(test_bitcoin
util_string_tests.cpp
util_tests.cpp
util_threadnames_tests.cpp
util_trace_tests.cpp
validation_block_tests.cpp
validation_chainstate_tests.cpp
validation_chainstatemanager_tests.cpp