[Bug]: litellm.completion retries fail silently without tenacity (not router/proxy) #5690

Open
F1bos opened this issue Sep 13, 2024 · 0 comments
Labels: bug (Something isn't working)

F1bos (Contributor) commented Sep 13, 2024

What happened?

When litellm encounters an error that triggers retries (e.g., model returns an invalid response, network issue), the retry mechanism fails silently if the tenacity library is not installed. The error message only indicates the initial error, without mentioning the missing dependency or the failed retry attempts.

To Reproduce

  1. Run any litellm code that includes retries (e.g., num_retries > 0) without tenacity installed.
  2. Induce an error that would normally trigger a retry (e.g., by providing an invalid prompt or simulating a network issue).
  3. The code fails with the initial error but doesn't mention the missing tenacity dependency or the failed retries.

Expected behavior

The error message should explicitly state that retries failed due to the missing tenacity library, allowing users to easily identify and resolve the issue. Ideally, it should also log the failed retry attempts.

Code Snippet for Validation

import asyncio
import json
import os

import litellm

litellm.enable_json_schema_validation = True
litellm.set_verbose = True  # see the raw request made by litellm

os.environ['GEMINI_API_KEY'] = ""

response_schema = {
    "type": "array",
    "items": {
        "type": "string",
    },
}

response_format = {
    "type": "json_object",
    "response_schema": response_schema,
    "enforce_validation": True
}

safety_settings = [
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE",
    },
]

async def main():
    messages = [
        {
            "role": "user",
            "content": "send me an invalid json"
        }
    ]

    response = await litellm.acompletion(
        model="gemini/gemini-1.5-flash",
        response_format=response_format,
        safety_settings=safety_settings,
        messages=messages,
        num_retries=1,
    )

    message = response.choices[0].message
    print('Response: ', response, ' | ', json.loads(message.content))

asyncio.run(main())
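Note: to hit the silent-failure path, run this snippet in an environment where tenacity is not installed (e.g. after pip uninstall -y tenacity). With tenacity present, the retry is actually attempted, as the workaround below confirms.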

Additional context

The relevant code responsible for retries is in the wrapper_async method:

return await litellm.acompletion_with_retries(*args, **kwargs)

However, the dependency on tenacity is only checked within acompletion_with_retries:

litellm/main.py, lines 2838-2848 at commit cd8d7ca:

async def acompletion_with_retries(*args, **kwargs):
    """
    [DEPRECATED]. Use 'acompletion' or router.acompletion instead!
    Executes a litellm.completion() with 3 retries
    """
    try:
        import tenacity
    except Exception as e:
        raise Exception(
            f"tenacity import failed please run `pip install tenacity`. Error{e}"
        )

This leads to a silent failure of retries if tenacity is not installed.

Possible Solution

The error handling within wrapper_async should be improved to catch the missing tenacity dependency and provide a more informative error message, including details about the failed retry attempts.
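For illustration only, here is a minimal sketch of such a preflight check, assuming wrapper_async can see the num_retries kwarg before it delegates; the helper name _require_tenacity is hypothetical and not existing litellm code:

def _require_tenacity(num_retries):
    # Hypothetical helper: fail loudly *before* any retry is attempted,
    # so the user sees the missing optional dependency instead of only
    # the original error that triggered the retry.
    if num_retries and num_retries > 0:
        try:
            import tenacity  # noqa: F401 - only needed when retries are requested
        except ImportError as e:
            raise ImportError(
                f"num_retries={num_retries} was set, but retries require the "
                f"optional `tenacity` package. Install it with `pip install tenacity`. "
                f"Original import error: {e}"
            ) from e

Chaining with `from e` keeps the original import failure visible in the traceback instead of swallowing it behind the error that triggered the retry.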

Workaround

Install the tenacity library using pip install tenacity.

Relevant log output

Request to litellm:
litellm.acompletion(model='gemini/gemini-1.5-flash', response_format={'type': 'json_object', 'response_schema': {'type': 'array', 'items': {'type': 'string'}}, 'enforce_validation': True}, safety_settings=[{'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'threshold': 'BLOCK_NONE'}], messages=[{'role': 'user', 'content': 'send me an invalid json'}], num_retries=1)


23:56:53 - LiteLLM:WARNING: utils.py:361 - `litellm.set_verbose` is deprecated. Please set `os.environ['LITELLM_LOG'] = 'DEBUG'` for debug logs.
ASYNC kwargs[caching]: False; litellm.cache: None; kwargs.get('cache'): None
Final returned optional params: {'response_mime_type': 'application/json', 'response_schema': {'type': 'array', 'items': {'type': 'string'}}, 'safety_settings': [{'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'threshold': 'BLOCK_NONE'}]}


POST Request Sent from LiteLLM:
curl -X POST \
https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=[REDACTED] \
-H 'Content-Type: *****' \
-d '{'contents': [{'role': 'user', 'parts': [{'text': 'send me an invalid json'}, {'text': "Use this JSON schema: \n     \n    {'type': 'array', 'items': {'type': 'string'}}\n    "}]}], 'safetySettings': [{'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'threshold': 'BLOCK_NONE'}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'threshold': 'BLOCK_NONE'}], 'generationConfig': {'response_mime_type': 'application/json'}}'


RAW RESPONSE:
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "[\"a\", \"b\", 1, \"d\"]\n"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 37,
    "candidatesTokenCount": 12,
    "totalTokenCount": 49
  }
}



raw model_response: {
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "[\"a\", \"b\", 1, \"d\"]\n"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 37,
    "candidatesTokenCount": 12,
    "totalTokenCount": 49
  }
}

Looking up model=gemini/gemini-1.5-flash in model_cost_map
Traceback (most recent call last):
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/litellm/litellm_core_utils/json_validation_rule.py", line 24, in validate_schema
    validate(response_dict, schema=schema)
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/jsonschema/validators.py", line 1332, in validate
    raise error
jsonschema.exceptions.ValidationError: 1 is not of type 'string'

Failed validating 'type' in schema['items']:
    {'type': 'string'}

On instance[2]:
    1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/projects/test/issue_showcase.py", line 63, in <module>
    asyncio.run(main())
  File "/home/user/miniconda3/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/user/projects/test/issue_showcase.py", line 52, in main
    response = await litellm.acompletion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/litellm/utils.py", line 1595, in wrapper_async
    raise e
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/litellm/utils.py", line 1456, in wrapper_async
    post_call_processing(
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/litellm/utils.py", line 735, in post_call_processing
    raise e
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/litellm/utils.py", line 727, in post_call_processing
    litellm.litellm_core_utils.json_validation_rule.validate_schema(
  File "/home/user/projects/test/.venv/lib/python3.12/site-packages/litellm/litellm_core_utils/json_validation_rule.py", line 26, in validate_schema
    raise JSONSchemaValidationError(
litellm.exceptions.JSONSchemaValidationError: litellm.APIError: litellm.JSONSchemaValidationError: model=, returned an invalid response=["a", "b", 1, "d"]
, for schema={"type": "array", "items": {"type": "string"}}.
Access raw response with `e.raw_response

Twitter / LinkedIn details

No response

F1bos added the bug label on Sep 13, 2024
krrishdholakia changed the title from "[Bug]: LiteLLM retries fail silently without tenacity" to "[Bug]: litellm.completion retries fail silently without tenacity (not router/proxy)" on Sep 13, 2024