Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP Message Abnormality and Misordering After Connection Reset in OkHttp Retry #8618

Open
izdt opened this issue Dec 9, 2024 · 2 comments
Labels
bug Bug in existing code

Comments

@izdt
Copy link

izdt commented Dec 9, 2024

When using OkHttp with Spring Cloud Zuul, the following issues occur:

  1. During automatic retry triggered by a Connection Reset exception, the HTTP message body is not properly written, while the headers are successfully written.
  2. Subsequent requests may carry duplicated HTTP headers (previous retry's headers + current request headers and body), resulting in an invalid HTTP message.
  3. When processed by F5 BigIP load balancer, the malformed HTTP message is split into two requests, causing misordered communication in reused connections.

Issue Analysis

  1. Connection interruption after sending the request

    • If the connection is interrupted without receiving a normal response, OkHttp captures the Connection reset SocketException and enters its retry logic.
    • The RetryAndFollowUpInterceptor checks if the current connection chain can retry and returns true to proceed with retry execution.
  2. Two-step HTTP message writing (headers and body)

    • The CallServerInterceptor is responsible for writing the HTTP message.
    • The writing logic first writes headers to the buffered sink, then writes the body.
  3. Issue with the message body writing

    • OkHttp’s message body writing is handled by the abstract RequestBody.writeTo(BufferedSink sink) method from the okio package.
    • Its implementation relies on the RealBufferedSink class, which provides multiple writing methods.
    • In the Spring Cloud Zuul environment, the OkHttpRibbonRequest implementation invokes RealBufferedSink.writeAll(Source source).
    • The writeAll(Source source) method reads the message body from a stream. After a connection reset, the original stream is already closed, resulting in a body length of 0 being written.
  4. Mismatch in HTTP header and body length

    • When closing the output stream after writing the body, OkHttp detects a mismatch between the expected and actual written bytes, throwing an exception and failing the retry.
  5. Connection not closed properly after an exception

    • After encountering an exception, OkHttp does not close the current connection.
    • This leads to reused connections retaining the previous request's headers, resulting in cascading HTTP message errors.

Steps to Reproduce

Environment

  • OkHttp version: All versions (issue 1), 3.14.x (issue 2).
  • Spring Cloud Zuul setup.
  • Backend services behind an F5 BigIP load balancer.

Reproduction Steps

  1. Use OkHttp as the HTTP client in Zuul.
  2. Simulate a network condition causing Connection Reset exceptions to trigger OkHttp's retry mechanism.
  3. Observe HTTP message content for duplicated headers and misordered communication in reused connections.

Code Example

In Zuul’s OkHttpRibbonRequest, the HTTP body is written using the following logic:

@Override
public void writeTo(OutputStream outputStream) throws IOException {
    BufferedSink sink = Okio.buffer(Okio.sink(outputStream));
    Source source = Okio.source(this.getInputStream());
    sink.writeAll(source); // Critical method call
    sink.close();
}

Expected Behavior

  1. After encountering a Connection Reset exception, OkHttp should properly handle retries, ensuring the consistency of HTTP headers and body.
  2. OkHttp should immediately close the current connection when exceptions occur to prevent errors in subsequent reused connections.

Actual Behavior

  1. The message body is empty after retry, and only the headers are written.
  2. Subsequent requests include duplicated HTTP headers, causing malformed requests.
  3. The unclosed erroneous connection leads to cascading errors in subsequent reused requests.

Logs and Exception Messages

Relevant exceptions captured during the issue:

java.net.SocketException: Connection reset
	at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:248)
	...
java.io.EOFException: unexpected end of stream on ...
	at okio.RealBufferedSink.writeAll(RealBufferedSink.java:123)
	...

Proposed Fixes

  1. Improve Request Stream Handling

    • During retry logic, OkHttp should detect stream status and regenerate the request body to ensure consistency during writing.
  2. Enhance Exception Handling

    • When an exception is detected, the current connection should be immediately closed and removed from the connection pool:
      if (streamError) {
          connection.close();
      }

Additional Information

  • Affected Versions: Issue 1 occurs in all OkHttp versions; Issue 2 affects 3.14.x.
  • Testing Environment: Spring Cloud Zuul integrated with OkHttp.
@izdt izdt added the bug Bug in existing code label Dec 9, 2024
@JakeWharton
Copy link
Collaborator

Hard to know if this is a real issue drowned in AI slop, or just a normal exception that some AI hallucinated as an issue and then drowned in slop.

@izdt
Copy link
Author

izdt commented Dec 10, 2024

Hard to know if this is a real issue drowned in AI slop, or just a normal exception that some AI hallucinated as an issue and then drowned in slop.

This issue is not AI-generated.It may be that I used AI to template the issue, which made you misunderstand it was generated by AI. It’s a real problem we encountered in our production environment . We captured the network packet and reappeared this issue. All the analysis are based on real debugging and troubleshooting.

Use writeAll(Source source) method did throw a exception when retry after a connection reset. And when use 3.14.x version, after PR#4627, double HTTP header occurred in one http request like this:

Image

More serious is that, after we used F5 BigIP for load balancing, the issue of double HTTP headers caused the subsequent HTTP response misorder. If needed, I can provide more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug in existing code
Projects
None yet
Development

No branches or pull requests

2 participants