Understand behavior with sandbox-old-api #119

Open
gfr10598 opened this issue Nov 30, 2018 · 2 comments

@gfr10598
Contributor

gfr10598 commented Nov 30, 2018

I created and pushed a new branch called sandbox-old-api. It changes how batch processing is done; the change evolved from refactoring work Ya and I have been doing.

When it was deployed around 5:54 Eastern, it had a huge impact on pipeline throughput. The impact is much larger than anticipated, so we should try to understand what was happening before and after.

Is this just because this effectively rolled back changes that Ya deployed in another sandbox?

NOTE: The last merge on master likely broke annotation. Since it autodeployed to staging, annotation in staging may no longer be working properly.

The sandbox-old-api branch should fix the breakage, so we may want to merge it sooner rather than later.

@gfr10598
Contributor Author

gfr10598 commented Dec 1, 2018

Looks like the weirdness is actually from the previous sandbox deployment.

From 5:30pm on the 28th (Eastern) to about 4pm on the 29th, there was negligible traffic (to sandbox) from ndt or sidestream. This may be because of PT traffic problems. It looks like the annotator response time was in excess of 30 seconds, which may have been causing the horrible PT throughput. That, in turn, may have left virtually no workers available to NDT.
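
To make the suspected starvation mechanism concrete, here is a back-of-the-envelope sketch using Little's law (average busy workers = arrival rate × time per task). Only the 30-second figure comes from the observation above; the worker count, the PT arrival rate, and the function names are made-up assumptions for illustration, not measurements from the pipeline.

```python
def max_throughput(workers: int, seconds_per_task: float) -> float:
    """Upper bound on tasks/second when each worker handles one task at a time."""
    if workers <= 0 or seconds_per_task <= 0:
        raise ValueError("workers and seconds_per_task must be positive")
    return workers / seconds_per_task


def free_workers(total_workers: int, arrival_rate: float, seconds_per_task: float) -> float:
    """Average number of idle workers left over after the slow tasks take their share.

    Little's law: average busy workers = arrival_rate * seconds_per_task,
    capped at the size of the pool.
    """
    busy = min(float(total_workers), arrival_rate * seconds_per_task)
    return total_workers - busy


# Hypothetical numbers: a pool of 20 workers and 1 PT task arriving per second.
POOL_SIZE = 20          # assumption, not the real worker count
PT_ARRIVAL_RATE = 1.0   # assumption, tasks per second
SLOW_LATENCY = 30.0     # seconds, from the annotator response time noted above

pt_ceiling = max_throughput(POOL_SIZE, SLOW_LATENCY)
idle = free_workers(POOL_SIZE, PT_ARRIVAL_RATE, SLOW_LATENCY)
print(f"PT throughput ceiling: {pt_ceiling:.2f} tasks/s, idle workers: {idle:.0f}")
```

Under these assumed numbers the slow PT tasks would occupy the whole pool, which would be consistent with NDT and sidestream seeing almost no workers. The real worker counts and arrival rates would need to be checked before drawing any conclusion.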

@gfr10598
Contributor Author

gfr10598 commented Dec 1, 2018

The Prometheus logs show an annotator version (not ss) from 20181126t112333. This is a bit confusing, as there doesn't seem to be a corresponding Travis build. There were "load-multiple" builds around 1pm (Travis time), but no sandbox builds on any nearby days. Perhaps this was a manual deployment.

The same deployment seems to have behaved better at other times. It was only during this one-day window, from the 28th to the 29th, that things seemed very bad.
