Client resilience to provider server errors #88

Open
zaksoup opened this issue Jan 29, 2020 · 3 comments
@zaksoup

zaksoup commented Jan 29, 2020

What happens now?

Some providers have implementation issues with their MDS endpoints. Standard requests commonly fail with unexplained 500 errors that disappear on retry, or the remote end disconnects mid-request.

What should happen?

I'd like to request that we investigate adding logic to retry requests under certain conditions, such as the remote end disconnecting mid-request or a 500 response.

How do we do that?

I'm opening this issue to discuss what the recommended course of action would be to improve the resilience of the client library.

@thekaveman
Contributor

Exponential backoff or some other retry mechanism? I've typically "handled" (major air quotes) these errors by making large requests multiple times over a given time period, but this is not ideal for anyone. The current escape sequence is ripe for improvement.
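For reference, a minimal sketch of what an exponential backoff schedule could look like. The function name `backoff_delays` and all of its parameters are hypothetical and not part of the client library; the optional jitter is one common variation, not something proposed in this thread.

```python
import random


def backoff_delays(base=1.0, factor=2.0, max_delay=60.0, attempts=5, jitter=False):
    """Yield the wait (in seconds) to apply before each retry attempt.

    Hypothetical helper: the delay grows by `factor` after every attempt
    and is capped at `max_delay`.
    """
    delay = base
    for _ in range(attempts):
        capped = min(delay, max_delay)
        # With jitter, pick a random wait in [0, capped] so that many
        # clients do not all retry at the same instant.
        yield random.uniform(0, capped) if jitter else capped
        delay *= factor


print(list(backoff_delays(attempts=5)))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Capping the delay keeps a long retry sequence from sleeping for hours, which matters when a single large request is retried many times.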

Some related conversation can be found in #13 and #82.

@zaksoup
Author

zaksoup commented Jan 30, 2020

As an aside, the current code's use of `is not` only works for status codes <= 256. I'm not a Pythonista by any account, so I spent a bit too long trying to figure out why

    x = 200
    x is 200
    # True
    y = 500
    y is 500
    # False (in the CPython REPL)

was happening. Turns out, `is` checks object identity, not equality. CPython caches small ints (-5 through 256) and always reuses the same object for them, but above that, equal values can be different objects. Status codes should be compared with `==` / `!=` instead.
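A small sketch of the value-based comparison, which does not depend on how the interpreter allocates ints. The helper name `is_server_error` is hypothetical and not taken from the client code.

```python
def is_server_error(status_code):
    """True for any 5xx status, compared by value rather than identity."""
    return 500 <= status_code <= 599


# Build the ints at runtime so the interpreter is free to create
# two separate objects that hold the same value.
a = int("500")
b = int("500")

print(a == b)  # True: the values are equal
print(a is b)  # identity check; an implementation detail, do not rely on it
print(is_server_error(a), is_server_error(200))  # True False
```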

@zaksoup
Author

zaksoup commented Jan 30, 2020

On topic... I wrote a very (very very) quick-and-dirty attempt at making the code a bit more retryable, including on connection errors. Any feedback on what would be more idiomatic Python is extremely welcome. This is in client.py...

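    # If session is a requests.Session, this assumes
    # `from requests.exceptions import ConnectionError` at the top of client.py:
    # requests' ConnectionError is not a subclass of the builtin one.
    # should_retry and pretty_sleep are helpers not shown here.
    @staticmethod
    def retryable_get(session, url, params):
        """GET with exponential backoff; gives up after 12 retries."""
        r = Client._get(session, url, params)
        wait_time = 1
        retries = 1
        while (r is None or should_retry(r.status_code)) and retries <= 12:
            if r is None:
                print(f"Connection error, retrying in {wait_time} second(s)")
            else:
                print(f"{r.status_code} received, sleeping {wait_time} second(s)")

            pretty_sleep(wait_time)
            r = Client._get(session, url, params)
            wait_time *= 2  # double the wait after every attempt
            retries += 1

        if r is None:
            raise ConnectionError
        return r

    @staticmethod
    def _get(session, url, params):
        """Return the response, or None if the connection failed."""
        try:
            return session.get(url, params=params)
        except ConnectionError:
            return None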
