No timeout for s3_is_overloaded #205

Closed
mlissner opened this issue Jul 24, 2017 · 7 comments

Comments

@mlissner
Contributor

mlissner commented Jul 24, 2017

I'm currently uploading about 3.4M items to the Internet Archive as part of the RECAP Project.

As I'm uploading, I've noticed my program freezing without recovery, which in the past has been because somebody (usually me) didn't remember to set the timeout on Python requests. This drives me nuts, and in this instance, it appears to be because s3_is_overloaded doesn't set a timeout. s3_is_overloaded also doesn't allow the user to set a timeout manually, so there's no easy fix on my end. I'm on version 1.6.0, but it looks like this is a problem on master as well.

Two thoughts:

  1. Is it possible to get a 1.7.0 release with this fix?

  2. If 1.7.0 isn't ready, is it possible to get a 1.6.1 release with this fix?

When these timeouts are missing, any program that relies on the code can freeze, possibly forever. It's a big issue. This could be resolved by adding a default timeout (as is done elsewhere in the code), and I don't think it would require any backwards-incompatible changes to the API.
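For illustration, here is the failure mode in plain `requests` terms (the URL below is just a placeholder, not necessarily the endpoint the library hits):

```python
import requests

# Placeholder URL for illustration only, not necessarily the endpoint the
# library actually hits.
url = 'https://s3.us.archive.org/'

# Without a timeout, requests waits on the socket indefinitely if the server
# never answers; this is the "freezes, possibly forever" behavior above.
# response = requests.get(url)

# With a timeout, the same call raises requests.exceptions.Timeout after
# 12 seconds instead of hanging, so the caller can retry or back off.
try:
    response = requests.get(url, timeout=12)
except requests.exceptions.Timeout:
    print('S3 status check timed out; backing off and retrying later.')
```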

Thank you very much.

mlissner added a commit to mlissner/internetarchive that referenced this issue Jul 24, 2017
Pretty simple tune-up to add `request_kwargs` as a param to the `s3_is_overloaded` method. Follows the pattern set by the other methods, and sets a default timeout of 12 seconds, as in the `get_metadata` method.
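Roughly what this enables on the caller side (a sketch only: `get_session` is the library's documented entry point, but the exact `s3_is_overloaded` signature in the released fix may differ from the `request_kwargs` form named here):

```python
from internetarchive import get_session

session = get_session()

# With request_kwargs threaded through, the caller can tighten or loosen
# the timeout instead of the request having no timeout at all.
overloaded = session.s3_is_overloaded(request_kwargs={'timeout': 5})
if overloaded:
    print('S3 reports overload; sleeping before retrying the upload.')
```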
@jjjake
Owner

jjjake commented Jul 24, 2017

Thanks for the feedback @mlissner.

e323993 should fix this. The timeout is set to 12 seconds, and if it is reached the method returns True (as in: yes, S3 is overloaded).
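Roughly, the behavior is this (sketch only; e323993 is the authoritative change): if the status check itself can't complete within the timeout, report overload rather than hang or raise.

```python
import requests

def s3_status_check(session, url, params):
    # Hedged sketch of the fixed behavior: a 12-second timeout, with a
    # timeout treated as "yes, S3 is overloaded" so callers back off.
    try:
        response = session.get(url, params=params, timeout=12)
    except requests.exceptions.Timeout:
        return True
    # Placeholder decision; the real method inspects the response body.
    return not response.ok
```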

Thanks for catching this, I understand how frustrating this issue can be!

I will work on getting 1.7.0 released over the next couple of days, but you can also use the master branch in the meantime.

jjjake closed this as completed Jul 24, 2017
@mlissner
Contributor Author

> I understand how frustrating this issue can be!

I guess it makes sense that an IA engineer would be familiar with the delights of this problem!

Thank you for the breakneck fix!

@jjjake
Owner

jjjake commented Jul 25, 2017

@mlissner v1.7.0 is now available on PyPI. Please let me know if you run into any issues.

@mlissner
Contributor Author

Great news. I've upgraded, and it'll probably take a day or so to confirm whether the issue is fixed. I suspect it will be, since I've checked all the APIs we use from this project, and this is the only one I saw that lacked a timeout.

@vxbinaca

@mlissner Hey, unrelated to the issue, but THANK YOU for your project. Way cooler than the little script I maintain that mirrors YouTube videos to IA. Wow, awesome.

@mlissner
Contributor Author

mlissner commented Aug 1, 2017

Confirmed. No more issues with timeouts for the APIs we use. 3.4M items uploaded. Thanks again.

@jjjake
Owner

jjjake commented Aug 2, 2017

Thanks for the follow up @mlissner, that's great to hear.

Wow, you already uploaded all 3.4M? Awesome! Feel free to email me if you ever need help or suggestions for large-scale uploads like that; happy to help (email is on my profile).
