tdx: cargo flag --with-perf-tools does not include perf tooling in the ramdisk #357

Open
babayet2 opened this issue Nov 18, 2024 · 6 comments
Assignees
Labels
tdx TDX specific bugs or features

Comments

@babayet2

Scenario
Building CVMs with perf tooling for evaluating OHCL TDX performance

Expected Behavior
Running the below command should produce an initrd which contains the "perf" and "tracing" tools, which we can leverage to get kernel (and user?) stacks.
cargo xflowey build-igvm x64-cvm-devkern --release --with-perf-tools --override-manifest=./vm/loader/manifests/openhcl-x64-cvm-dev.json

Observed Behavior
We must modify openhcl's rootfs.config to include the perf and tracing binaries, and build a custom kernel with CONFIG_BPF_SYSCALL. With this workaround we are able to get flamegraphs of the kernel stacks, but we have not found a build configuration that allows us to get the more critical user stacks.

@chris-oo added the tdx label (TDX specific bugs or features) Nov 18, 2024
@chris-oo
Member

@jaredwhitedev maybe knows some hints here?

@jaredwhitedev
Contributor

Yeah the --with-perf-tools flag didn't work last I checked. If you dump a working perf binary into the running OHCL environment and have debug info enabled (--with-debuginfo), perf record will work as expected though. No kernel config changes needed.

@chris-oo
Member

I wonder if we need to remove that flag or just have that flag bundle some stock perf tools image from a known source. Where did you get your "working perf binary"?

I assume we still need to add CONFIG_BPF_SYSCALL for kernel flamegraph support?

@jaredwhitedev
Contributor

I don't remember at this point, I think I built it a long time ago. Important bit is that it is statically linked. I believe we already bundle the perf binary with the kernel build, I see it here: flowey-persist/flowey_lib_hvlite__download_openhcl_kernel_package/extracted/Microsoft.OHCL.Kernel.6.6.51.7-main-cvm-x64.tar.gz/tools/perf/bin/perf. The --with-perf-tools flag is probably just broken.

I haven't needed CONFIG_BPF_SYSCALL for kernel and user stack sampling with perf.
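Since the key requirement mentioned above is that the perf binary is statically linked, here is a minimal sketch of how one might check that before copying a binary into OHCL. It assumes the standard `file` utility is available on the build host; the helper name `linkage_of` and the `./perf` path in the usage line are hypothetical, not part of the OpenHCL tooling.

```shell
# Hypothetical helper: classify how an executable is linked, based on the
# description printed by file(1). Prints "static", "dynamic", or "unknown".
linkage_of() {
    # -L follows symlinks, -b omits the file name from the output.
    case "$(file -Lb "$1")" in
        *"statically linked"*|*"static-pie linked"*) echo static ;;
        *"dynamically linked"*)                      echo dynamic ;;
        *)                                           echo unknown ;;
    esac
}
```

For example, `linkage_of ./perf` could be run against whichever perf build you plan to drop into the OHCL environment; anything other than `static` suggests the binary may depend on shared libraries that are not present in the ramdisk.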

@babayet2
Author

@jaredwhitedev it's likely that the kernel config is a red herring; it's only there to get rid of the "Couldn't synthesize bpf events" error when running perf.

Building --with-debuginfo does not change the output, I'm still only seeing kernel stacks in the flamegraph (attached an example). Could my record command be wrong?
.\ohcldiag-dev.exe ohcl_linux run -- perf record -F 999 -a --call-graph=dwarf -o ./openhcl.perf -- sleep 15
[Image: attached flamegraph example]

@jaredwhitedev
Contributor

Yeah, I agree on the kernel config front.

What command are you using to view/process the perf.data? It looks like Brendan Gregg's flamegraph scripts? Are you running perf script inside OHCL?

For the record, I can normally see user-space stacks by executing the following inside OHCL (via the shell command):

perf record -a -g -- sleep 1
perf report

This works as a quick correctness test.
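To tell whether a recording contains any user-space frames at all, independent of the flamegraph tooling, one option is to count frames in the text output of `perf script`. The following is a rough sketch, assuming the default `perf script` layout in which each stack frame line starts with a hex address and ends with the DSO in parentheses, with kernel frames attributed to `[kernel.kallsyms]`. The function name `count_frames` is hypothetical, and kernel module frames would be counted as non-kernel by this simple check.

```shell
# Hypothetical helper: read `perf script` text on stdin and print how many
# stack frames belong to [kernel.kallsyms] versus any other DSO.
count_frames() {
    awk '
        # A frame line: first field is a hex address, line ends with "(dso)".
        $1 ~ /^[0-9a-f]+$/ && /\)$/ {
            if ($0 ~ /\(\[kernel\.kallsyms\]\)$/) kernel++
            else non_kernel++
        }
        END { printf "kernel=%d non_kernel=%d\n", kernel, non_kernel }
    '
}
```

A possible usage is `perf script -i ./openhcl.perf | count_frames`, run wherever the binaries and debug info referenced by the recording are available. If `non_kernel` comes back as zero, the user stacks are missing from the processed recording itself rather than being dropped by the flamegraph scripts.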
