Nightly timed out on Github Actions, too #103
The last two nightlies both failed with 429 errors. This is probably not related to https://meta.stackexchange.com/questions/400700/chat-search-does-not-bring-up-new-results (since that is reported to have started only today), but I'm still mentioning it, just in case.
Experimentally add stderr logger for requests to debug issue #103
scrape_chat.py: Experimentally add stderr logger for requests to debug issue #103
Dumbfoundingly, adding more debug logging seems to have made the problem go away, at least for now. I'll leave the setting hardcoded for the time being.
Nightly #1006 failed with a 429 error, apparently killing the chat client.
This happened again, and I expect it to continue to happen again from time to time. My current thinking is that ChatExchange needs to be updated to cope with 429 errors correctly. There is some logic in there which attempts to handle them, but clearly it is not working in this case. The dreaded threaded design exacerbates this by hiding the error, so the bot continues to try even though the delivery thread is dead because of an unhandled exception.
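To illustrate the "hidden error" problem described above, here is a minimal sketch of a delivery thread which remembers the exception that killed it, so that the caller fails loudly instead of queueing messages for a dead thread. This is not ChatExchange code; `DeliveryWorker`, `send`, and `post` are hypothetical names chosen for the example.

```python
import queue
import threading


class DeliveryWorker(threading.Thread):
    """Hypothetical delivery thread that records the exception which killed it."""

    def __init__(self, send):
        super().__init__(daemon=True)
        self.send = send            # callable which actually posts a message
        self.outbox = queue.Queue()
        self.error = None           # set if the thread dies from an exception

    def run(self):
        try:
            while True:
                message = self.outbox.get()
                if message is None:     # sentinel: shut down cleanly
                    return
                self.send(message)
        except Exception as exc:        # remember why we died instead of hiding it
            self.error = exc

    def post(self, message):
        # Fail loudly in the caller if the worker is already dead.
        if self.error is not None:
            raise RuntimeError("delivery thread died") from self.error
        self.outbox.put(message)
```

With something like this, the first `post()` after a fatal 429 would raise in the main thread, with the original exception attached as `__cause__`, rather than silently piling up undelivered messages.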
I finally managed to repro on my laptop. I ran
#1031 and #1032 seem to have failed for similar reasons. The tracebacks look quite different, but all of them feature 429 errors. I hadn't seen this before: some weird JavaScript responses from the server:
The updated retry logic provides more details. It seems that attempting to post a message triggers what is probably a CAPTCHA page, which then obviously fails. The Nightly #1051 transcript contains this sequence of events.
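For reference, retry logic of the general kind discussed above could look roughly like the following. This is a sketch under assumptions, not the actual code in ChatExchange or scrape_chat.py: `call_with_backoff`, `RateLimited`, and the default delays are made up for the example, and it only honors the seconds form of the standard `Retry-After` header. It would not get past a CAPTCHA page; it only spaces out the attempts.

```python
import time


class RateLimited(Exception):
    """Raised when the server keeps answering 429 after all retries."""


def call_with_backoff(request, max_tries=5, base_delay=2.0, sleep=time.sleep):
    """Call request() until it stops returning HTTP 429.

    request is any zero-argument callable returning an object with
    status_code and headers attributes (e.g. a requests.Response).
    """
    for attempt in range(max_tries):
        response = request()
        if response.status_code != 429:
            return response
        # Prefer the server's Retry-After hint (seconds form only);
        # otherwise back off exponentially.
        hint = response.headers.get("Retry-After", "")
        delay = float(hint) if hint.isdigit() else base_delay * 2 ** attempt
        if attempt < max_tries - 1:     # no point sleeping after the last try
            sleep(delay)
    raise RateLimited(f"still rate limited after {max_tries} attempts")
```

Passing `sleep` as a parameter keeps the function testable without real delays. Raising a dedicated exception at the end means the caller can decide whether a persistent 429 should fail the nightly run or just be logged.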
It is perhaps slightly weird that posting to chat is the operation which fails, whereas only posting to chat seems to be fine. I guess repeated accesses to the room info pages are what actually triggers the problem. I notice that e.g.
tripleee$ curl -s https://chat.stackoverflow.com/robots.txt; echo
User-Agent: *
Allow: /transcript/
Allow: /rooms/schedule/export/
Disallow: /rooms
Disallow: /rooms/
Disallow: /users
Disallow: /users/
Disallow: /search
Disallow: /search/
Disallow: /login
Disallow: /login/
Disallow: /logout
Disallow: /logout/
Disallow: /feed
Disallow: /feed/
Disallow: /events
Disallow: /events/
Disallow: /chats
Disallow: /chats/
Disallow: /*?
Allow: /?tab=all&sort=active&page=*
Allow: /?tab=all&sort=active&page=*&nohide=*
Allow: /
# for "/*?", refer to https://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
# remember that routes /like/this will get indexed, whereas routes /like/this?will=not
#
# beware, the sections below WILL NOT INHERIT from the above!
# https://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
#
#
# disallow adsense bot, as we no longer do adsense.
#
User-agent: Mediapartners-Google
Disallow: /
#
# Yahoo bot is evil.
#
User-agent: Slurp
Disallow: /
#
# Yahoo Pipes is for feeds not web pages.
#
User-agent: Yahoo Pipes 1.0
Disallow: /
Sitemap: http://chat.stackoverflow.com/sitemap.xml
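A quick way to check which URLs fall under these rules is Python's standard `urllib.robotparser`. The sketch below feeds it an excerpt of the output above; the `example-bot` user agent and the specific room and transcript URLs are placeholders. Note that this parser applies the first matching rule and does not implement wildcard patterns such as `/*?`, so it is only a rough approximation of how a real crawler would read the file.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the robots.txt output shown above (wildcard rules left out).
ROBOTS_EXCERPT = """\
User-Agent: *
Allow: /transcript/
Allow: /rooms/schedule/export/
Disallow: /rooms
Disallow: /rooms/
Disallow: /users
Disallow: /users/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

BASE = "https://chat.stackoverflow.com"


def allowed(path, agent="example-bot"):
    """Return True if the excerpted rules allow this path for the agent."""
    return parser.can_fetch(agent, BASE + path)


for path in ("/rooms/1", "/transcript/1", "/users/1"):
    print(path, allowed(path))
```

By these rules the room info pages under `/rooms` are off limits to crawlers, while transcripts are explicitly allowed.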
It's been a lot more robust recently, but it timed out again a couple of nights ago. https://github.com/tripleee/sloshy/actions/runs/11546295410/job/32134442564 is actually registered as successful but the ping never arrived.
Nightly #933 died with 429 errors just after I switched back to GitHub Actions from CircleCI (closed #87).