fix(litellm): handle FallbackStreamWrapper in router streaming responses [backport 4.3] #16141
Closed
+1,579
−408
Conversation
Backport c40e537 from #15883 to 4.2. Aspect benchmarks are very fast, but the small number of values introduces high variability in the results. This PR increases the number of values for iast aspect benchmarks to obtain a more stable measurement. APPSEC-60435 Signed-off-by: Alberto Vara <alberto.vara@datadoghq.com> Co-authored-by: Alberto Vara <alberto.vara@datadoghq.com>
## Description This is a backport of #15962. This issue was introduced in 4.1.
… [backport 4.2] (#15967) Backport 99fe656 from #15961 to 4.2. ## Description CI run for #15723 Co-authored-by: Alexandre Choura <42672104+PROFeNoM@users.noreply.github.com> Co-authored-by: Kian Jones <kian@letta.com> Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> Co-authored-by: Louis Tricot <75956635+dubloom@users.noreply.github.com>
Backport 595fab1 from #15990 to 4.2. ## Description This unflakes `test_multiprocessing` by increasing the sampling rate for it (aligned with other tests). [See test runs](https://app.datadoghq.com/ci/test/runs?query=test_level%3Atest%20%40ci.pipeline.name%3ADataDog%2Fapm-reliability%2Fdd-trace-py%20%40git.branch%3Akowalski%2Ftest-profiling-unflake-test_multiprocessing%20status%3Aerror%20%40ci.job.name%3A%2Aprofile%2A%20%40test.name%3A%2Atest_multiprocessing%2A&agg_m=count&agg_m_source=base&agg_q=%40test.name&agg_q_source=base&agg_t=count&fromUser=false&index=citest&mode=sliding&top_n=100&top_o=top&viz=timeseries&x_missing=true&start=1767623825768&end=1768228625768&paused=false) Co-authored-by: Thomas Kowalski <thomas.kowalski@datadoghq.com>
…ckport 4.2] (#16036) Backport a51047c from #16035 to 4.2. ## Description This PR removes a check added in #11182 that recolored langchain-openai spans as LLM kind when certain kwarg key/value pairs were detected. That check led to duplicate LLM spans representing the same LLM call (and therefore duplicated LLM span counts, cost/token metrics, and status/error metrics). While we acknowledge that some langchain-openai traces may be missing LLM spans if the downstream openai integration is disabled, we've decided it's more important to avoid duplicate LLM spans because of the billing implications. Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
… 4.2] (#16084) Backport 742811e from #16064 to 4.2. ## Description We noticed that our memory profiler flamegraphs are upside down. This fixes that by removing the `set_reverse_locations(true)` call from the memory profiler traceback. Added a simple regression test that fails without the change and passes with it. Signed-off-by: Taegyun Kim <taegyun.kim@datadoghq.com> Co-authored-by: Taegyun Kim <taegyun.kim@datadoghq.com>
Yun-Kim
approved these changes
Jan 21, 2026
…ses (#16037) In litellm>=1.74.15, router streaming responses are wrapped in `FallbackStreamWrapper` (for mid-stream fallback support), which doesn't expose the `.handler` attribute the integration expected. This change adds defensive handling to check for the handler attribute before accessing it. When the handler is not available, the response is wrapped in our own `TracedStream` to ensure spans are properly finished. Also reported here: BerriAI/litellm#13725 --------- Co-authored-by: Yun Kim <yun.kim@datadoghq.com> (cherry picked from commit 4c65feb)
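A minimal sketch of the defensive pattern described above. `TracedStream` and `handle_router_stream` are simplified stand-ins for the integration's internals, not the actual dd-trace-py code:

```python
# Illustrative only: names and signatures here are placeholders, assuming the
# span object exposes a finish() method as ddtrace spans do.

class TracedStream:
    """Yield chunks from the wrapped stream and finish the span when done."""

    def __init__(self, stream, span):
        self._stream = stream
        self._span = span

    def __iter__(self):
        try:
            for chunk in self._stream:
                yield chunk
        finally:
            # Guarantees the span is closed even if the consumer stops early
            # or the underlying stream raises.
            self._span.finish()


def handle_router_stream(resp, span):
    # Older litellm router streams expose a .handler the integration relied on;
    # FallbackStreamWrapper (litellm >= 1.74.15) does not, so check first.
    if getattr(resp, "handler", None) is not None:
        return resp  # original code path: handler-based span finishing
    # No handler available: wrap the stream ourselves so the span still finishes.
    return TracedStream(resp, span)
```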
d2e8b9a to 6bfcf15
taegyunkim
approved these changes
Jan 22, 2026
Backport 4c65feb from #16037 to 4.3.
In litellm>=1.74.15, router streaming responses are wrapped in `FallbackStreamWrapper` (for mid-stream fallback support), which doesn't expose the `.handler` attribute the integration expected. This change adds defensive handling to check for the handler attribute before accessing it. When the handler is not available, the response is wrapped in our own `TracedStream` to ensure spans are properly finished.

Also reported here: BerriAI/litellm#13725
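For context, a hedged usage sketch of the call pattern that exercises this path on litellm >= 1.74.15. The model names and router configuration are illustrative only and are not part of this PR:

```python
import litellm

# Illustrative router setup; any deployment list works the same way.
router = litellm.Router(
    model_list=[
        {
            "model_name": "gpt-4o-mini",
            "litellm_params": {"model": "openai/gpt-4o-mini"},
        }
    ]
)

# With stream=True, litellm >= 1.74.15 may return a FallbackStreamWrapper,
# which is the object the patched integration now handles defensively.
resp = router.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)

# Consuming the stream is what allows the integration to finish its span.
for chunk in resp:
    pass
```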