Profiling Linux Builds with Perf
To profile spring-dedicated while it is running, use the Linux perf tools, which are easy to install on most Ubuntu/Debian-based systems.
First, find the PID of the spring-dedicated process you want to attach to. The process appears to run two threads with roughly equal shares of CPU usage. List the candidates with:
ps aux | grep spring
Remember the process ID of the main thread (generally the smaller one)
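To look at the individual threads rather than the process as a whole, Linux exposes one directory per thread under /proc/PID/task. The following is a minimal sketch under that assumption; list_threads is a hypothetical helper name, and the current shell's own PID stands in for the spring-dedicated PID so the snippet runs anywhere.

```shell
# list_threads PID: print the thread IDs (TIDs) of a process, one per line,
# in ascending order. Relies on Linux /proc: each entry under
# /proc/PID/task is one thread of that process.
list_threads() {
    ls "/proc/$1/task" | sort -n
}

# Stand-in PID for illustration; substitute the spring-dedicated PID here.
target_pid=$$

# The main thread is generally the smallest TID (and usually equals the PID).
main_tid=$(list_threads "$target_pid" | head -n 1)

echo "threads of $target_pid: $(list_threads "$target_pid" | tr '\n' ' ')"
echo "main thread TID: $main_tid"
```

The TIDs printed here are the values that --tid= expects if you only want to sample a single thread.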
- Create a working directory to store your captures in.
- Download the debug symbols from the releases page here on GitHub (take care to select Linux and the version that matches your running build).
- Unpack the debug symbols, and make sure the symbol files (e.g. spring-dedicated.dbg) end up in your working directory.
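The preparation steps above can be sanity-checked before starting a capture. This is a rough sketch: the directory name perf-captures and the helper have_symbols are illustrative assumptions, and the download itself is left out because it depends on the release you are profiling.

```shell
# Hypothetical working directory; any writable location will do.
workdir="${WORKDIR:-./perf-captures}"
mkdir -p "$workdir"

# have_symbols DIR: succeed if at least one .dbg file sits directly in DIR.
have_symbols() {
    for f in "$1"/*.dbg; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

if have_symbols "$workdir"; then
    echo "debug symbols found in $workdir"
else
    echo "no .dbg files in $workdir yet; unpack the symbols archive there"
fi
```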
Often you will need sudo permissions to perform this capture. perf will warn you otherwise.
Example command:
sudo perf record --pid=3140530 --freq=1000 -m 10M -o spring-dedicated_6cg_lag1.perfoutput --stat -g --call-graph dwarf -e cpu-cycles
Arguments:
- --pid= the process ID of spring-dedicated. You can pass multiple PIDs as a comma-separated list. This samples every thread of the given process.
- --tid= the thread ID of an individual thread you wish to sample (rather than the whole process). Typically the main thread has the same TID as the PID of the process.
- --freq= the number of samples to take per second. For higher frequencies, or on slower machines, use the -m flag to give perf a larger buffer.
- -m (# | #M) - number of mmap data pages OR a size specification with an appended unit character. This memory is a ring buffer: the kernel writes perf samples into it, and userspace reads them out and writes them to disk. Larger values allow high-resolution sampling for longer periods before samples are dropped.
- -o the name of the output file. If you don't name it, perf writes to the default perf.data, so a new capture takes the place of the previous one.
- -g enable call graphs (implied by --call-graph below)
- --stat Record per-thread event counts. Use it with perf report -T to see the values.
- -e eventtype1,eventtype2 - a comma-separated list of events to record. A list of supported events on your system can be found with perf list. Typically, we recommend hardware events like cpu-cycles, cache-misses, and branch-misses. Be aware that the more events you specify, the more data you'll save on disk.
- --call-graph dwarf : the format of the call graphs. I am unsure whether 'fp' or 'dwarf' is the correct argument here, but dwarf seems to work better. From the perf record manual:
When "dwarf" recording is used, perf also records (user) stack dump
when sampled. Default size of the stack dump is 8192 (bytes).
User can change the size by passing the size after comma like
"--call-graph dwarf,4096".
When "fp" recording is used, perf tries to save stack entries
up to the number specified in sysctl.kernel.perf_event_max_stack
by default. User can change the number by passing it after comma
like "--call-graph fp,32".
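Because a fixed -o name is easy to reuse by accident, it can help to generate the command line with a unique output file per capture. The sketch below only prints the command instead of running it; perf_record_cmd and the timestamped naming scheme are assumptions for illustration, while the flags are the ones from the example command above.

```shell
# perf_record_cmd PID LABEL [STAMP]: print a perf record command line whose
# output file name contains LABEL and a timestamp, so that successive
# captures do not replace each other. STAMP defaults to the current time.
perf_record_cmd() {
    pid=$1
    label=$2
    stamp=${3:-$(date +%Y%m%d-%H%M%S)}
    out="spring-dedicated_${label}_${stamp}.perfoutput"
    echo "sudo perf record --pid=$pid --freq=1000 -m 10M -o $out --stat -g --call-graph dwarf -e cpu-cycles"
}

# Print the command for review; copy and run it once it looks right.
perf_record_cmd 3140530 6cg_lag1
```

Printing rather than executing keeps the helper safe to experiment with, and the label makes it easier to tell captures apart later.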
Press CTRL+C to stop the profiler. If you are automating the capture, send perf a SIGTERM signal from your code instead.
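For automated captures, the stop step might look like the sketch below. stop_after is a hypothetical helper, and a background sleep stands in for the backgrounded perf record process so the snippet runs without perf installed. If perf was started through sudo, the kill will most likely need sudo as well.

```shell
# stop_after SECONDS PID: wait for the capture window to pass, then ask the
# process to exit by sending it SIGTERM.
stop_after() {
    sleep "$1"
    kill -TERM "$2"
}

# Stand-in for a backgrounded "perf record ..." command.
sleep 30 &
capture_pid=$!

# Capture for one second, then stop the process.
stop_after 1 "$capture_pid"

# Collect the exit status; 128 + 15 (SIGTERM) is reported for a process
# that was terminated by the signal without handling it.
capture_status=0
wait "$capture_pid" || capture_status=$?
echo "capture process exited with status $capture_status"
```

perf handles the signal itself and finishes writing the output file before exiting, so its exit status will generally differ from that of the stand-in process.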
- build or install hotspot following steps here: https://github.com/KDAB/hotspot#getting-hotspot
- run hotspot via
$ hotspot
- open the perf data file you saved from perf record
- Enjoy!
Alternatively, browse the capture in the terminal with perf report:
sudo perf report -i spring-dedicated_6cg_lag1.perfoutput
Use the '+' key on your keyboard to expand call graphs within perf report.
Showing the per-thread event counts recorded with --stat:
sudo perf report -T -i spring-dedicated_6cg_lag1.perfoutput