Bug: autoscaling.virtio-mem is flaky #1006

Fails roughly 50% of the time.

Comments
@Omrigan how long has it been flaky for? Can you link some example runs?
I'd assume it's been flaky since it was introduced.
Your PR: https://github.com/neondatabase/autoscaling/actions/runs/9878750381/job/27283615758#step:14:1034 I had 3 or 4 such cases, I think. However, maybe it is not that particular test; here is another test failing: https://github.com/neondatabase/autoscaling/actions/runs/9890395470/job/27319200362#step:14:987
We should check whether these are only timeouts or whether we suspect an actual problem. Marking as P1 until then.
Haven't yet had a chance to look at it. However, @Omrigan, if you're truly observing it fail 50% of the time, it may be due to your PR. The occurrence you flagged was the first time I'd seen it fail on one of my PRs.
A couple of recent runs:
Both cases look like they timed out after 5 minutes on downscaling.
Reproduces locally in about 2 hours. From the logs, it seems like the issue is that the VM sometimes uses too much memory, which prevents downscaling. I think the next step for debugging would be to periodically print memory usage in the logs. For now I'll put this issue in selected, but I want to get back to it later. My logs
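For the periodic memory logging, something along these lines could work. This is a minimal sketch, not code from the repo: the 10-second interval and reading /proc/meminfo from inside the guest are assumptions, not what the agent actually does.

```go
package main

import (
	"bufio"
	"log"
	"os"
	"strings"
	"time"
)

// logMemInfo prints the MemTotal/MemFree/MemAvailable lines from
// /proc/meminfo. Hypothetical helper; path and fields are assumptions.
func logMemInfo() {
	f, err := os.Open("/proc/meminfo")
	if err != nil {
		log.Printf("meminfo: %v", err)
		return
	}
	defer f.Close()

	var fields []string
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "MemTotal:") ||
			strings.HasPrefix(line, "MemFree:") ||
			strings.HasPrefix(line, "MemAvailable:") {
			fields = append(fields, strings.Join(strings.Fields(line), " "))
		}
	}
	log.Printf("memory: %s", strings.Join(fields, ", "))
}

func main() {
	// Log memory usage every 10 seconds until the process exits.
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		logMemInfo()
	}
}
```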
This might be fixed by now. Arthur will try to reproduce it and see if it is fixed.
https://gist.github.com/petuhovskiy/226522b34bd85a3a8d2d8ee88fa43dbd Tried to reproduce; it failed only once in 360 runs. Let's consider it fixed for now.
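For anyone retrying this later, a repro loop like the gist's could look roughly like the sketch below. The `make e2e-test TEST=...` invocation is a placeholder, not the actual command from the gist.

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	const runs = 360
	failures := 0
	for i := 1; i <= runs; i++ {
		// Placeholder command; substitute the real e2e test invocation.
		cmd := exec.Command("make", "e2e-test", "TEST=autoscaling.virtio-mem")
		out, err := cmd.CombinedOutput()
		if err != nil {
			failures++
			fmt.Printf("run %d failed: %v\n%s\n", i, err, out)
		}
	}
	fmt.Printf("%d/%d runs failed\n", failures, runs)
}
```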