Summary
The Error_handling.ipynb cookbook recommends using @retry.Retry(predicate=retry.if_transient_error) from google-api-core for handling transient errors. However, this predicate does not catch errors raised by google-genai because:
- retry.if_transient_error checks for google.api_core.exceptions types (e.g., ServiceUnavailable, TooManyRequests)
- google-genai raises its own error types: google.genai.errors.ClientError (4xx) and google.genai.errors.ServerError (5xx)
These are completely separate exception hierarchies with no inheritance relationship.
Environment
- google-genai: 1.56.0
- google-api-core: 2.25.1
- Python: 3.12
Reproduction
1. Verify if_transient_error doesn't catch google-genai errors
from google.api_core import retry
from google.genai import errors
# Test with google.genai.errors (what google-genai actually raises)
test_cases = [
    ('ClientError 429 (rate limit)', errors.ClientError(429, {'message': 'rate limit'}, None)),
    ('ServerError 503 (unavailable)', errors.ServerError(503, {'message': 'unavailable'}, None)),
    ('ServerError 500 (internal)', errors.ServerError(500, {'message': 'internal'}, None)),
]
print('Testing retry.if_transient_error with google.genai.errors:')
print('-' * 60)
for name, exc in test_cases:
    result = retry.if_transient_error(exc)
    print(f'{name}: {result}')
Output:
Testing retry.if_transient_error with google.genai.errors:
------------------------------------------------------------
ClientError 429 (rate limit): False
ServerError 503 (unavailable): False
ServerError 500 (internal): False
All return False - the retry decorator will never trigger for actual google-genai errors.
2. Verify it DOES catch google.api_core.exceptions
from google.api_core import retry, exceptions
# Test with google.api_core.exceptions
test_cases = [
    ('TooManyRequests', exceptions.TooManyRequests('rate limit')),
    ('ServiceUnavailable', exceptions.ServiceUnavailable('unavailable')),
    ('InternalServerError', exceptions.InternalServerError('internal')),
]
print('Testing retry.if_transient_error with google.api_core.exceptions:')
print('-' * 60)
for name, exc in test_cases:
    result = retry.if_transient_error(exc)
    print(f'{name}: {result}')
Output:
Testing retry.if_transient_error with google.api_core.exceptions:
------------------------------------------------------------
TooManyRequests: True
ServiceUnavailable: True
InternalServerError: True
3. Verify the exception hierarchies are separate
from google.genai.errors import ServerError, ClientError, APIError
from google.api_core.exceptions import GoogleAPIError, ResourceExhausted
print('ServerError bases:', ServerError.__bases__)
print('ResourceExhausted bases:', ResourceExhausted.__bases__)
print()
print('ResourceExhausted is subclass of genai.APIError:', issubclass(ResourceExhausted, APIError))
# Compare against GoogleAPIError itself so the check matches the printed label
print('genai.ServerError is subclass of api_core.GoogleAPIError:',
      issubclass(ServerError, GoogleAPIError))
Output:
ServerError bases: (<class 'google.genai.errors.APIError'>,)
ResourceExhausted bases: (<class 'google.api_core.exceptions.TooManyRequests'>,)
ResourceExhausted is subclass of genai.APIError: False
genai.ServerError is subclass of api_core.GoogleAPIError: False
Why the notebook's test appears to work
In cell-16, the test manually raises google.api_core.exceptions.ServiceUnavailable:
if generate_content_first_fail.call_counter == 1:
    raise exceptions.ServiceUnavailable("Service Unavailable")  # <- google.api_core.exceptions
This is caught by if_transient_error. But in production, when google-genai encounters a 503, it raises google.genai.errors.ServerError, not google.api_core.exceptions.ServiceUnavailable.
Suggested Fixes
Option 1: Custom predicate for google-genai errors
from google.api_core import retry
from google.genai import errors
def if_genai_transient_error(exception):
    """Predicate for retrying google-genai transient errors."""
    return (
        isinstance(exception, errors.APIError)
        and exception.code in (408, 429, 500, 502, 503, 504)
    )
@retry.Retry(
    predicate=if_genai_transient_error,
    initial=2.0,
    maximum=64.0,
    multiplier=2.0,
    timeout=600,
)
def generate_with_retry(prompt):
    return client.models.generate_content(model=MODEL_ID, contents=prompt)
Option 2: Use google-genai's built-in HttpRetryOptions
google-genai has built-in retry support via HttpRetryOptions that correctly handles its own errors:
from google import genai
from google.genai import types
retry_options = types.HttpRetryOptions(
    attempts=5,
    initial_delay=2.0,
    max_delay=64.0,
    http_status_codes=[408, 429, 500, 502, 503, 504]
)
client = genai.Client(
    api_key=GOOGLE_API_KEY,
    http_options=types.HttpOptions(retry_options=retry_options)
)
# Now all calls through this client will automatically retry on transient errors
response = client.models.generate_content(model=MODEL_ID, contents=prompt)
This uses tenacity internally with the correct predicate:
# From google/genai/_api_client.py
retry = tenacity.retry_if_exception(
    lambda e: isinstance(e, errors.APIError) and e.code in retriable_codes,
)
Additional Note
google-genai does not depend on google-api-core (it uses httpx directly), so users who only install google-genai won't have access to google.api_core.retry anyway:
$ pip show google-genai | grep Requires
Requires: anyio, google-auth, httpx, pydantic, requests, tenacity, websockets, ...
(No google-api-core in the dependencies)
Recommendation
Update the Error_handling.ipynb to either:
- Use a custom predicate that checks for google.genai.errors.APIError
- Recommend HttpRetryOptions for the built-in retry mechanism
- Fix the test in cell-16 to actually simulate what google-genai raises (use errors.ServerError instead of exceptions.ServiceUnavailable)
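As a rough illustration of the last point, here is how a "fail once, then succeed" test could raise an error shaped like what google-genai raises. To stay runnable with no packages installed, it makes two assumptions that are not from the notebook: FakeServerError is a hypothetical stand-in for google.genai.errors.ServerError (it only carries a code attribute), and call_with_retry is a small hand-written loop standing in for retry.Retry. The predicate here keys on the status code alone, whereas Option 1 also checks isinstance(exception, errors.APIError).

```python
import time

RETRIABLE_CODES = (408, 429, 500, 502, 503, 504)


class FakeServerError(Exception):
    """Hypothetical stand-in for google.genai.errors.ServerError."""

    def __init__(self, code, message):
        super().__init__(f"{code} {message}")
        self.code = code


def if_transient_status(exc):
    """Retry only on exceptions carrying a retriable HTTP status code."""
    return getattr(exc, "code", None) in RETRIABLE_CODES


def call_with_retry(func, predicate, attempts=3, delay=0.0):
    """Stand-in for a retry decorator: retry while predicate(exc) is true."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt or on a non-retriable error.
            if attempt == attempts or not predicate(exc):
                raise
            time.sleep(delay)


call_counter = 0


def generate_content_first_fail():
    """Fail once with a 503-style error, then succeed."""
    global call_counter
    call_counter += 1
    if call_counter == 1:
        raise FakeServerError(503, "Service Unavailable")
    return "ok"


result = call_with_retry(generate_content_first_fail, if_transient_status)
print(result, call_counter)
```

In the notebook itself, the stand-ins would be replaced by errors.ServerError and whichever retry mechanism is recommended, so the test exercises the same exception type that production calls raise.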