Support walking through inner spans in langfuse SDK for e2e trace evaluation #8260

bjornjee · 2025-08-01T08:28:54Z


Describe the feature or potential improvement

Goal

For integration evals in a multi-agent setup (input -> agent_a -> agent_b -> output), a single input-output call produces one trace containing multiple spans. I want a way to use that trace to evaluate each agent's behaviour. For each span within the trace, we need the function_name, input, and output.
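To make the requested shape concrete, here is a minimal sketch of the per-span records described above. It assumes the spans of one trace are already available as a flat list; the `Observation` class, its field names, and `span_records` are hypothetical stand-ins for illustration, not the Langfuse SDK's API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass
class Observation:
    """Hypothetical stand-in for one span in a trace (not an SDK type)."""
    id: str
    name: str                      # assumed to hold the function or agent name
    start_time: datetime
    input: Any = None
    output: Any = None
    parent_observation_id: Optional[str] = None


def span_records(observations: list[Observation]) -> list[dict[str, Any]]:
    """Return one {function_name, input, output} record per span, in start order."""
    ordered = sorted(observations, key=lambda o: o.start_time)
    return [
        {"function_name": o.name, "input": o.input, "output": o.output}
        for o in ordered
    ]


# Example trace for input -> agent_a -> agent_b -> output, deliberately unsorted.
trace = [
    Observation("2", "agent_b", datetime(2025, 8, 1, 8, 0, 2), "plan", "answer"),
    Observation("1", "agent_a", datetime(2025, 8, 1, 8, 0, 1), "question", "plan"),
]
records = span_records(trace)
```

Each record could then be passed to a per-agent evaluator, and the ordered list as a whole to a metric that needs the full sequence.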

Problem

In a multi-step agent setup, the trace contains multiple spans, one for each inner call to the LLM provider. We currently have no way to construct or fetch an inner span by span_id in order to evaluate it.

Happy to contribute. I do not have a good solution to propose yet, and I am open to other ways of achieving e2e evaluation.

Additional information

No response

marliessophie · 2025-08-01T15:02:55Z

Maintainer

Hey, thanks for mentioning this! Would being able to run LLM-as-a-judge on an individual span level work? Or do you require a sequence of spans?


bjornjee Aug 3, 2025
Author

We would like a sequence of spans, so that we can run evaluations such as the task completeness / tool correctness metrics from deepeval.

wongjingping · 2025-09-15T03:39:12Z


+1 on having some way to traverse the span tree and access inner spans more easily (e.g. via DFS/BFS). Currently we have to reconstruct the span tree using the flat list of observations, which is still feasible, but feels like additional work.
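For reference, the reconstruction mentioned above can be sketched roughly as follows. This assumes each observation in the flat list carries its own id and its parent's id; the `Obs` class, its field names, and both helper functions are hypothetical and only illustrate the approach, they are not part of the Langfuse SDK.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Obs:
    """Hypothetical flat observation record (not an SDK type)."""
    id: str
    name: str
    parent_id: Optional[str] = None
    start: float = 0.0             # assumed start timestamp, used to order siblings
    input: Any = None
    output: Any = None


def build_children(observations: list[Obs]) -> dict[Optional[str], list[Obs]]:
    """Group observations by parent id; roots are stored under the key None."""
    known = {o.id for o in observations}
    children: dict[Optional[str], list[Obs]] = defaultdict(list)
    for o in observations:
        # Treat an observation whose parent is missing from the list as a root.
        parent = o.parent_id if o.parent_id in known else None
        children[parent].append(o)
    for siblings in children.values():
        siblings.sort(key=lambda o: o.start)
    return children


def walk_dfs(observations: list[Obs]) -> Iterator[tuple[int, Obs]]:
    """Yield (depth, observation) pairs in pre-order, depth-first."""
    children = build_children(observations)
    stack = [(0, o) for o in reversed(children[None])]
    while stack:
        depth, node = stack.pop()
        yield depth, node
        # Push children in reverse so the earliest-started child is visited first.
        stack.extend((depth + 1, c) for c in reversed(children[node.id]))


flat = [
    Obs("c", "llm_call_b", "b", 3.0),
    Obs("a", "agent_a", "root", 1.0),
    Obs("root", "pipeline", None, 0.0),
    Obs("b", "agent_b", "root", 2.0),
    Obs("d", "llm_call_a", "a", 1.5),
]
visited = [(depth, o.name) for depth, o in walk_dfs(flat)]
```

Swapping the stack for a `collections.deque` and popping from the left would give a BFS instead.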


marliessophie Sep 16, 2025
Maintainer

I understand, and I appreciate the feedback. This is new feature territory, so for now we'll need to put a pin in cross-span-level evals.
We will, however, be adding observation-level evaluations soon, which might open up more design space for you.
