-
-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interop runner counts one connection attempt too many when parsing logs #125
Comments
Here's how the number of handshakes is counted: https://github.com/marten-seemann/quic-interop-runner/blob/56fcba0285f063d8e0f4f8d068335d2c85712f5f/testcases.py#L163-L168 Why is this leading to overcounting in this case? |
In the C1 case, it happens if a client's initial is corrupted in a way that changes the CID. The server will reply to both the correct Initial and the corrupted Initial. In the L1 case, it happens if a server times out one connection attempt and starts a new one later. This is indeed what happened here:
I confirmed that scenario by checking the client side qlog and the corresponding server side qlog. (I am creating an issue in Picoquic because the two connections where logged under the same ICID-derived file ID and the second file erased the first one -- I should really derive the file ID from the SCID.) If you counted the handshake with the client side PCAP, you would could 50 connections because none of the Server Initial with the first SCID made it to the client. In fact, there is a remarkable set of circumstance that caused the client to not receive any of the server first flight, and the server to not receive any of the repeated client initial with sequences 1 to 6. Given a 30% loss rate, this kind of loss sequence is supposed to have a probability of about 0.002%, but it seems to happen much more frequently than that. |
That shouldn't happen though. If the CID is corrupted, the AEAD tag won't be correct. Or am I missing something here? |
Yes, you are right about C1. But the L1 case was pretty clear once I managed to identify the logs. |
Looking at the logs of this handshake loss trial, I find that the test failed because
Expected 50 handshakes. Got: 51
. Yet, the client and server logs only mention 50 connections, and the client downloaded the expected 50 files.The test of the number of handshakes is probably mistaking a failed attempt for a successful one, leading to a false negative in the result. False negatives are unhelpful. They waste time when people investigate, and they push people to just dismiss test results without investigating them further.
The text was updated successfully, but these errors were encountered: