Broader notion of retryable exception #1174
Comments
Yes, that method should be renamed. Sadly, it's kind of an empirical question for each provider which errors are in fact retryable. The heuristic we have today is in inspect_ai/src/inspect_ai/_util/retry.py (line 12 in b4b1656).
Ideally each provider could apply this heuristic plus whatever custom heuristics are required. The current handling, as you no doubt noted, is kind of scattershot. Ideally we can intercept HTTP-status-code-bearing exceptions for all providers and apply the heuristics in the function linked to above. Then, in addition, we can add other exception types over time that we've noticed are retryable.
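For illustration, a minimal sketch of the kind of shared heuristic described above, assuming httpx-based providers; the function names here are made up and the actual code in `_util/retry.py` may differ:

```python
import httpx


def is_retryable_http_status(status: int) -> bool:
    # Shared heuristic: retry rate limits, request timeouts, and
    # transient server-side failures.
    return status in (408, 429, 500, 502, 503, 504)


def should_retry_exception(ex: BaseException) -> bool:
    # Intercept HTTP-status-bearing exceptions and apply the shared heuristic;
    # providers can then add their own retryable exception types on top.
    if isinstance(ex, httpx.HTTPStatusError):
        return is_retryable_http_status(ex.response.status_code)
    if isinstance(ex, (httpx.ConnectTimeout, httpx.ReadTimeout)):
        return True
    return False
```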
That works as well, if you're OK with a breaking change to the API.
Didn't realise this! Thanks
Yep, this makes sense.
It would be useful to display the number of retried responses, so that there is some indication of what's going on if lots of requests are being retried without being 429s. OTOH, we don't want to overload the in-progress display with an ever-increasing amount of information. If we're going to have just one, IMO the total number retried is more important than the number rate limited. What do you think?
You might want to hold off on this, only because we've got another set of related changes we want to make soon: being able to detect exactly the time taken for inference (vs. retries) on a model call. Anyway, all the code related to retrying/rate limits is due for an overhaul (and this is the top priority for one of our biggest users). This will probably go down in the next 3-4 weeks, so I would wait for this to land (or even be in progress, and then we can work on it together).
I'd just rename the stuff inside Inspect but still call the old API for backwards compatibility.
Sure, happy to wait. Keep us posted on this issue. |
Sorry, how would this work? At first glance I don't see how this would work with Python inheritance (unless it involves some really gnarly introspection that I don't think we should do). Could you post a code snippet demonstrating what you mean?
Yes, some introspection (that's very frequently what you need to do to provide graceful backward compatibility). We want both of these things to be true: we almost never break people and we evolve our APIs over time to make them more elegant. Upholding those principles is IMO more important than the principle of "never do anything gnarly". |
I see, and fair enough. What would this look like? A code snippet would be perfect for me to learn about this. |
Something like this:

```python
from typing import Type

def is_overridden(method_name: str, subclass: Type, base_class: Type) -> bool:
    # non-overridden methods resolve to the same function object as on the base class
    return getattr(subclass, method_name) is not getattr(base_class, method_name)

def is_rate_limit_overridden(model: ModelAPI) -> bool:
    return is_overridden("is_rate_limit", type(model), ModelAPI)
```
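For example, a hypothetical legacy provider would be detected like so (the class and override are made up for illustration, the `is_rate_limit` signature is approximate, and the snippet assumes the code above plus `ModelAPI` in scope):

```python
class LegacyProvider(ModelAPI):
    def is_rate_limit(self, ex: BaseException) -> bool:  # old-style override
        return "429" in str(ex)

# True for LegacyProvider (so Inspect could keep honoring the old hook);
# False for providers that leave the base implementation untouched.
is_overridden("is_rate_limit", LegacyProvider, ModelAPI)
```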
Currently, we retry a provider error if it's a rate limit (inspect_ai/src/inspect_ai/model/_model.py, line 322 in ac268d1).
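Schematically, that wiring amounts to something like the following (a sketch using tenacity purely for illustration; the actual code at the referenced line may differ):

```python
from tenacity import retry, retry_if_exception, wait_exponential_jitter

def generate_with_retry(model, call):
    # Retry only when the provider classifies the exception as a rate limit;
    # any other error propagates immediately.
    @retry(
        retry=retry_if_exception(model.is_rate_limit),
        wait=wait_exponential_jitter(),
    )
    def attempt():
        return call()

    return attempt()
```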
But there are other errors that should be retried, such as undifferentiated 500s, arguably 502s, timeouts, and the like.
Currently these notions are sometimes conflated in the code; for example, a `ReadTimeout` is treated as a rate limit in inspect_ai/src/inspect_ai/model/_providers/mistral.py (lines 173 to 179 in ac268d1).
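The pattern being described is roughly this shape (a schematic illustration of the conflation, not the actual mistral provider code):

```python
import httpx

def is_rate_limit(ex: BaseException) -> bool:
    # Lumping ReadTimeout in with true 429s makes timeouts retryable today,
    # but mislabels them as rate limits (e.g. in the in-progress display).
    return isinstance(ex, httpx.ReadTimeout) or (
        isinstance(ex, httpx.HTTPStatusError)
        and ex.response.status_code == 429
    )
```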
My proposal is to introduce a new method `should_retry` to `ModelAPI` to capture the broader notion of an exception that should be retried. If you agree with the idea, I can take a crack at implementing this.

In the in-progress display, we should probably also replace "HTTP Rate Limits" with "HTTP Retries" or something.
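A minimal sketch of what the proposed hook could look like, assuming a backward-compatible default that defers to the existing `is_rate_limit` (the class bodies shown are schematic, not the real `ModelAPI`):

```python
import httpx

class ModelAPI:
    def should_retry(self, ex: Exception) -> bool:
        # Broader hook: by default defer to the legacy rate-limit check so
        # existing providers keep their current behaviour.
        return self.is_rate_limit(ex)

    def is_rate_limit(self, ex: Exception) -> bool:
        # Legacy hook, retained for backwards compatibility.
        return False

class ExampleProvider(ModelAPI):
    def should_retry(self, ex: Exception) -> bool:
        # Providers can now also retry timeouts and transient 5xx errors.
        if isinstance(ex, httpx.TimeoutException):
            return True
        if isinstance(ex, httpx.HTTPStatusError):
            return ex.response.status_code in (429, 500, 502, 503, 504)
        return super().should_retry(ex)
```

Combined with the override check sketched earlier, the default could also route through an overridden `is_rate_limit` only when a provider still defines one, keeping existing providers working unchanged.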