Proposal: support for third party OSS to use otel-ebpf-profiler #192

Draft
wants to merge 63 commits into
base: main

amitschendel

Hey, continuing the discussion in #33 and the work from the IG team, we (Kubescape) took some time to build a POC that we think aligns well with the project roadmap.
The goal of this PR is to make it possible to integrate this awesome project as a package in other OSS projects such as Kubescape and Inspektor Gadget.
To do that, we added two main things:

  1. Support for kprobes instead of perf_events, so that the unwinder can be triggered from a tail call whenever we want a stack trace (e.g. when we see syscall xyz, we want the stack to see who triggered it).
  2. Support for running from within a container, which means supporting access to other processes' filesystems.

The flow for a third-party OSS project can be:
Take main.go and modify it to obtain the fd of native_tracer_entry and register a custom reporter instead of the default ones. Then we can do:
I want a stack trace -> tail_call (native_tracer_entry) -> custom_reporter.
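
The user-space end of that flow could look roughly like the sketch below. It is only an illustration: `Trace`, `TraceReporter`, `collectingReporter` and `deliver` are hypothetical names and are not the profiler's actual reporter types. A channel stands in for whatever mechanism hands unwound traces from the eBPF side to user space.

```go
package main

import "fmt"

// Trace and TraceReporter are hypothetical stand-ins for illustration only;
// they are NOT the profiler's actual types.
type Trace struct {
	PID    uint32
	Frames []uint64 // return addresses, innermost frame first
}

// TraceReporter is what a third-party project would implement to receive traces.
type TraceReporter interface {
	ReportTrace(t Trace)
}

// collectingReporter is a custom reporter that keeps traces in memory
// instead of sending them to a backend.
type collectingReporter struct {
	traces []Trace
}

func (r *collectingReporter) ReportTrace(t Trace) {
	r.traces = append(r.traces, t)
}

// deliver forwards every trace received on ch to the reporter until ch is closed.
func deliver(ch <-chan Trace, r TraceReporter) {
	for t := range ch {
		r.ReportTrace(t)
	}
}

func main() {
	// The channel simulates traces arriving after a tail call triggered unwinding.
	ch := make(chan Trace, 1)
	ch <- Trace{PID: 42, Frames: []uint64{0x401000, 0x401234}}
	close(ch)

	r := &collectingReporter{}
	deliver(ch, r)
	fmt.Println("traces received:", len(r.traces))
}
```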

First, we would love your feedback on how we can push this to become part of the project, and on what is missing or needs to be changed.
Second, we don't see any releases, so we added support in the Makefile for compiling with the EXTERNAL flag, which triggers compilation of the kprobes variant instead of the perf_events one. I assume we will need some sort of release process for both variants, so I would love to hear your thoughts on it.

In addition, support for native symbols (C/C++/Go etc.) is implemented in your backend software, whose code we could not find. We therefore wonder whether we could get the protocol for talking to your symbol server, so that we can resolve native symbols without using the backend software.

Thanks!

amitschendel and others added 19 commits October 10, 2024 09:47
Signed-off-by: Amit Schendel <[email protected]>

linux-foundation-easycla bot commented Oct 15, 2024

@fabled
Contributor

fabled commented Oct 15, 2024

> Hey, in continue to #33 discussion and work from the IG team, we (Kubescape) took some time to do a POC that we think can align well with the project roadmap. The goal of this PR is to introduce the option to integrate this awesome project as a pkg in other OSS like Kubescape and Inspektor gadget. In order to do it we added two main things:

Nice!

Since the project has an enforced squash rule for PRs, it would be useful to have one PR per logical feature instead of one big PR. Could you split this to logical pieces?

> 1. Support for `kprobes` instead of `perf_events` in order to have the ability to trigger the unwinding capabilities from a tail call when we want a stack trace (e.g when we see syscall xyz we want the stack to see who triggered it).

I did some private experiments with something similar earlier. Since we are also looking to support off-CPU profiling (see #144), it would make sense to instead compile the eBPF code twice: once as perf_event type and once as kprobe type. We will likely have the kprobe variant used by the agent itself. The same variant will also be useful attached as a uprobe, so that the backtrace can be triggered from user mode.

> 2. Support for running from within a container, meaning to support accessing other processes fs.

The PR looks wrong on this part. There is already support for accessing the files via /proc/PID/root, by prepending it in the places where the file is opened. See:

If something is missing or done wrong, it should be done in the proper abstraction level, and not by adjusting the mappings path.
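
As a rough illustration of the /proc/PID/root approach described above (a sketch only, not the profiler's actual code), a path seen inside another process's mount namespace can be rewritten before it is opened. The helper name `hostPath` is hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
)

// hostPath is a hypothetical helper (not taken from the profiler's code base).
// It rewrites a path as seen by process pid into a path that can be opened
// from another mount namespace through /proc/<pid>/root.
func hostPath(pid int, path string) string {
	// Anchor the path at "/" first so that ".." components cannot
	// lexically climb out of /proc/<pid>/root.
	inside := filepath.Join("/", path)
	return filepath.Join("/proc", strconv.Itoa(pid), "root", inside)
}

func main() {
	fmt.Println(hostPath(1234, "/usr/lib/libc.so.6"))
	// prints "/proc/1234/root/usr/lib/libc.so.6"
}
```

Doing the rewrite at the point where the file is opened keeps the mapping paths themselves unchanged, which matches the abstraction level suggested above.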

> First, we would love to have feedback from you on how we can push this to be part of the project and what is missing/need to be changed. Second, we don't see any releases and so we added support in the Makefile to compile with the EXTERNAL flag which will trigger the compilation of the kprobes instead of the perf_events but we are going to need some sort of release process for both methods I assume so I would love to have your thoughts on it.

Let's start by splitting this into logical PRs, one per feature. Also, remove the EXTERNAL tag by compiling the eBPF code twice to produce both variants.

> In addition the support for native symbols (C/C++/GO etc...) are implemented in your backend software that we didn't find the code to, and so we wonder whether we can have the protocol to talk to your symbol server in some way to be able to resolve native symbols without using the backend software.

I believe vendors do different things on this part. @christos68k or @florianl can perhaps comment from the Elastic / devfiler side on what the roadmap or plan is.

amitschendel and others added 7 commits October 15, 2024 12:55
Signed-off-by: Amit Schendel <[email protected]>
…ce to go tags instead of Makefile

Signed-off-by: Amit Schendel <[email protected]>
@amitschendel
Author

Thanks @fabled, I pushed the changes so that running make ebpf compiles both versions, and moved the control of the embedding to Go build tags so that it is easy to choose which one to use. Let me know if this addresses your needs.

@afek854

afek854 commented Oct 27, 2024

Hey, in addition to the three bullets mentioned by @florianl, we encountered another challenge while attempting to enhance our monitored events with OTEL eBPF profiler traces.

Currently, our setup monitors both kprobes and tracepoints, but we need a reliable way to uniquely identify events in our monitoring system for correlation with your trace as enrichment data.

To resolve this, we suggest generating unique identifiers for trace events and our events using the following components:

- Stack pointer: from task_struct->thread_struct
- PID and TID combination: the combination of the process ID and thread ID.
- Syscall number: the specific syscall being executed.
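
One possible shape for such an identifier is sketched below, assuming both sides can read the same three components. The type `EventKey`, its field widths, and the choice of FNV-1a as the hash are assumptions made for illustration and are not part of the proposal.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// EventKey is a hypothetical correlation key built from the components
// listed above. Field names and widths are assumptions.
type EventKey struct {
	StackPointer uint64 // stack pointer read at the time of the event
	PID          uint32
	TID          uint32
	Syscall      uint32 // syscall number
}

// ID serializes the fields in a fixed little-endian layout and hashes them
// with 64-bit FNV-1a, so two producers with the same inputs get the same ID.
func (k EventKey) ID() uint64 {
	var buf [20]byte
	binary.LittleEndian.PutUint64(buf[0:8], k.StackPointer)
	binary.LittleEndian.PutUint32(buf[8:12], k.PID)
	binary.LittleEndian.PutUint32(buf[12:16], k.TID)
	binary.LittleEndian.PutUint32(buf[16:20], k.Syscall)

	h := fnv.New64a()
	h.Write(buf[:]) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

func main() {
	fromTrace := EventKey{StackPointer: 0x7ffd1000, PID: 100, TID: 101, Syscall: 59}
	fromMonitor := EventKey{StackPointer: 0x7ffd1000, PID: 100, TID: 101, Syscall: 59}
	fmt.Println(fromTrace.ID() == fromMonitor.ID()) // prints "true"
}
```

Note that a thread repeating the same syscall from the same call site would yield identical keys under this scheme, so a timestamp or sequence counter may also be needed to tell such events apart.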

We would appreciate your feedback on the following:

- Is our proposed method for generating unique event identifiers effective?
- Are there alternative methods you would recommend for this purpose?

Thank you for your insights!
CC: @Amit Schendel

@florianl
Contributor

> What's the ETA for #196?

There is no ETA for #196; it needs more feedback and approvals. First, #144 needs to be accepted and merged, and feedback is very welcome. #196 (comment) started a discussion about the design of sampling, which needs to be resolved first.

> We encountered another challenge while attempting to enhance our monitored events with OTEL eBPF profiler traces.

> Currently, our setup monitors both kprobes and tracepoints, but we need a reliable way to uniquely identify events in our monitoring system for correlation with your trace as enrichment data.

The points mentioned in #192 (comment) are just high-level points. Being able to distinguish between the events that triggered profiling is fundamental, not only for your use case but also for on- vs off-CPU sampling, so there needs to be some way to do it. What information will be used and how it is fetched is up for discussion.

florianl added a commit to florianl/opentelemetry-ebpf-profiler that referenced this pull request Oct 29, 2024
This is the code that backs
open-telemetry#144.
It can be reused to add features like requested in
open-telemetry#33 and
therefore can be an alternative to
open-telemetry#192.

The idea that enables off CPU profiling is, that perf event and kprobe eBPF
programs are quite similar and can be converted. This allows, with the
dynamic rewrite of tail call maps, the reuse of existing eBPF programs and
concepts.

This proposal adds the new flag '-off-cpu-threshold' that enables off CPU
profiling and attaches the two additional hooks, as discussed in Option B
in open-telemetry#144.

Outstanding work:
- [ ] Handle off CPU traces in the reporter package
- [ ] Handle off CPU traces in the user space side

Signed-off-by: Florian Lehner <[email protected]>
florianl added a commit to florianl/opentelemetry-ebpf-profiler that referenced this pull request Nov 1, 2024
florianl added a commit to florianl/opentelemetry-ebpf-profiler that referenced this pull request Nov 8, 2024
florianl added a commit to florianl/opentelemetry-ebpf-profiler that referenced this pull request Nov 18, 2024
5 participants