Sample for implementing opentelemetry tracing #136

Open
haf opened this issue Aug 4, 2021 · 9 comments

@haf

haf commented Aug 4, 2021

It would be great to be able to trace our gRPC servers and clients using opentelemetry. How would I tie it into this library?

I would like to have each message handler be called in a context (or passed one) that I can create further child spans from and attach logs to.

@haf haf closed this as completed Aug 7, 2021
@haf haf reopened this Aug 7, 2021
@vmagamedov
Owner

Since 0.4.2rc2 there are status-related properties in the SendTrailingMetadata and RecvTrailingMetadata events: https://grpclib.readthedocs.io/en/latest/events.html#grpclib.events.RecvTrailingMetadata

So now it is possible to add tracing by using events system.

@haf
Author

haf commented Aug 21, 2021

I’ve started doing this, but I have to use contextvars and attach the listeners every time I start a server, rather than configuring this once in the library.

@haf
Author

haf commented Aug 21, 2021

Could you possibly provide an example that gets the unary/stream combinations right?

@vmagamedov
Owner

vmagamedov commented Aug 21, 2021

> associate the listeners whenever I start a server; rather than having it in the config of the library

You can (1) create the server instance and attach listeners before (2) starting the server. Steps (1) and (2) can be done in different places. This is almost the same as providing interceptors to other libraries when you create a server:

https://github.com/open-telemetry/opentelemetry-python-contrib/blob/d9c01168716481abac185cc9d5c71462b5722179/instrumentation/opentelemetry-instrumentation-grpc/tests/test_server_interceptor.py#L290-L297
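A minimal sketch of that split, for illustration only. `listen`, `RecvRequest`, `SendTrailingMetadata`, `method_name` and `status` are grpclib names; `current_call`, `finished_calls` and `attach_tracing` are hypothetical, and a plain dict stands in for a real OpenTelemetry span. It assumes grpclib dispatches both events from the task that handles the call, so a context variable set in the first listener is still visible in the second.

```python
import contextvars
from typing import TYPE_CHECKING, Dict, List

if TYPE_CHECKING:  # type hints only; importing this module does not require grpclib
    from grpclib.events import RecvRequest, SendTrailingMetadata
    from grpclib.server import Server

# Hypothetical per-call state; a real setup would store an OpenTelemetry span here.
current_call = contextvars.ContextVar("current_call", default=None)
finished_calls: List[Dict[str, object]] = []


async def on_recv_request(event: "RecvRequest") -> None:
    # Assumption: this runs in the task handling the request, so the value
    # is visible to the handler and to later events of the same call.
    current_call.set({"method": event.method_name, "status": None})


async def on_send_trailing_metadata(event: "SendTrailingMetadata") -> None:
    call = current_call.get()
    if call is not None:
        call["status"] = event.status  # status-related property mentioned above
        finished_calls.append(call)


def attach_tracing(server: "Server") -> None:
    """Step (1): register listeners; this can live far from server.start()."""
    # Imported lazily so the listeners can be exercised without grpclib installed.
    from grpclib.events import RecvRequest, SendTrailingMetadata, listen

    listen(server, RecvRequest, on_recv_request)
    listen(server, SendTrailingMetadata, on_send_trailing_metadata)


# Step (2), possibly in another module:
#   server = Server([Greeter()])
#   attach_tracing(server)
#   await server.start("127.0.0.1", 50051)
```

Keeping the registration in one function means every place that builds a server calls the same helper, which is close to passing an interceptor list at construction time.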

> Could you possibly provide an example that gets the unary/stream combinations right?

Can you elaborate? What do you mean by "unary/stream combinations right"?

@haf
Author

haf commented Aug 21, 2021

> Can you elaborate? What do you mean by "unary/stream combinations right"?

Yes. For example, a stream-request/stream-response method would have to create PRODUCER/CONSUMER spans rather than CLIENT/SERVER spans, and if one side closes the connection, there may be spans that are only "associated with" a trace after the fact, rather than having an explicit parent sent from the client at runtime. This is especially so if you're also creating traces for the lifetimes of objects in your app (e.g. I start a new trace when I start the server).

I'm also trying to come to terms with how to manage tracing of Python asyncio tasks (again, "associated with"-type spans?).

There's also the matter of reading trace data from the metadata of requests (the caller started a trace and provides a SpanContext to the server); a sample of how that should be done in gRPC would be a nice addition (and I'll get there, but I haven't investigated this path fully yet).
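For illustration, a minimal sketch of what reading the caller's span context from request metadata involves, assuming the caller propagates it in the W3C Trace Context `traceparent` header (OpenTelemetry's default format). `RemoteSpanContext` and `parse_traceparent` are hypothetical names, and only the four-field layout is handled; in a real service the OpenTelemetry propagation API would do this parsing.

```python
import re
from typing import Mapping, NamedTuple, Optional

# traceparent layout: version "-" trace-id "-" parent-id "-" trace-flags (lowercase hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class RemoteSpanContext(NamedTuple):
    """Hypothetical container for the caller's span identity."""
    trace_id: int
    span_id: int
    sampled: bool


def parse_traceparent(metadata: Mapping[str, str]) -> Optional[RemoteSpanContext]:
    """Return the caller's span context, or None if absent or malformed."""
    header = metadata.get("traceparent")  # gRPC metadata keys are lowercase
    if header is None:
        return None
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero ids are invalid per the specification.
    if version == "ff" or int(trace_id, 16) == 0 or int(span_id, 16) == 0:
        return None
    return RemoteSpanContext(
        trace_id=int(trace_id, 16),
        span_id=int(span_id, 16),
        sampled=bool(int(flags, 16) & 0x01),
    )
```

A server-side listener could call `parse_traceparent(event.metadata)` and start its SERVER span as a child of the returned context, or start a new trace when the result is `None`.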

As for trailing metadata, I haven't been able to find a good resource on this. I've read in the code docs that it's what is sent as an "ending" to streams. Or can it be sent multiple times during a request/response interaction?

As for the events, I haven't been able to figure out how to capture exceptions using the eventing system. That is, it's not just about providing a span to the Handler, but also about capturing exceptions from it. I've resorted to an explicit get() in the function body of the Handler, because then I can capture stack traces. An example of how to manage this sort of error would be nice, including a "bad request" raised from the Handler as a gRPC message.
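One possible approach, sketched under assumptions: grpclib documents `method_func` on the `RecvRequest` event as mutable, so a listener can swap the handler for a wrapper that records the exception and re-raises it. `recorded_errors`, `capture_exceptions` and `wrap_handler` are hypothetical names, and the list stands in for something like recording the exception on a span.

```python
import functools
import traceback
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical sink; a real setup would record the exception on the current span.
recorded_errors: List[Dict[str, str]] = []

Handler = Callable[[Any], Awaitable[None]]


def capture_exceptions(handler: Handler) -> Handler:
    """Wrap a handler coroutine so exceptions are recorded, then re-raised."""

    @functools.wraps(handler)
    async def wrapper(stream: Any) -> None:
        try:
            await handler(stream)
        except Exception as exc:
            recorded_errors.append({
                "type": type(exc).__name__,
                "message": str(exc),
                "stacktrace": "".join(
                    traceback.format_exception(type(exc), exc, exc.__traceback__)
                ),
            })
            raise  # let the server turn it into a gRPC status as usual

    return wrapper


async def wrap_handler(event: Any) -> None:
    """RecvRequest listener: replace the handler with the recording wrapper."""
    event.method_func = capture_exceptions(event.method_func)
```

Because the wrapper re-raises, a handler that raises grpclib's `GRPCError` (for example with an invalid-argument status for a bad request) should still reach the client unchanged, while the stack trace is captured on the way out.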

It would also be interesting to hear if specifically contextvars are the recommended solution? I read you said so in a previous answer to an issue?

I'm not much for monkeypatching, coming from statically typed languages... It's often more complex to debug and relies on implementation details rather than APIs.

@vmagamedov
Owner

Adapted opentelemetry-instrumentation-grpc to the helloworld example: https://gist.github.com/vmagamedov/19a29f7a4f8f70d76bbc797a0e994112

Should be enough to understand how to extract request metadata, exceptions, status etc.

This example is just a POC. I still don't understand how `attach(extract(event.metadata))` works :) I don't see request metadata in the console (ConsoleSpanExporter), so I can't confirm that this actually works.

> It would also be interesting to hear if specifically contextvars are the recommended solution? I read you said so in a previous answer to an issue?

Yes, this is how context propagation in Python works. Everyone uses contextvars under the hood.
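A small standard-library sketch of what that means for asyncio tasks: each task gets a copy of the context that was current when it was created, so a child task sees the parent's value but cannot overwrite it for the parent. `current_span` and the string values are placeholders, not OpenTelemetry objects.

```python
import asyncio
import contextvars
from typing import Dict

# Placeholder for "the active span"; real tracing libraries keep theirs in a ContextVar too.
current_span = contextvars.ContextVar("current_span", default="<no span>")


async def child(name: str, seen: Dict[str, str]) -> None:
    seen[name] = current_span.get()    # inherited from the task that created us
    current_span.set(f"{name}-span")   # only changes this task's copy of the context


async def main() -> Dict[str, str]:
    seen: Dict[str, str] = {}
    current_span.set("request-span")
    # Each task snapshots the current context at creation time.
    tasks = [asyncio.create_task(child(n, seen)) for n in ("a", "b")]
    await asyncio.gather(*tasks)
    seen["parent"] = current_span.get()  # unaffected by the children's set() calls
    return seen


if __name__ == "__main__":
    print(asyncio.run(main()))
```

This is why a span stored in a context variable at the start of a request is visible in tasks spawned by the handler, while sibling tasks do not leak their child spans into each other.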

@haf
Author

haf commented Oct 18, 2023

Hi @vmagamedov, is this gist still up to date with the best way of doing tracing? :)

For example, isn't this more idiomatic?

[image attachment not preserved]

@vmagamedov
Owner

It is definitely not the best way of doing tracing, just a quick proof of concept.
