Incorrect retries on connection that still succeeded #8587
Comments
Here are some additional logs leading up to the exception. 2024-07-11 16:01:30.624 11418-14990 System.err W OkHttp Extra [2024-07-11 16:01:30] Q10000 starting : OkHttp ConnectionPool |
Here's the events log: 2024-07-12 17:00:16.543 26646-27016 System.err W OkHttp Extra [2024-07-12 17:00:16] Q10000 scheduled after 0 µs: OkHttp ConnectionPool
The log says the response failed at 17:00:20.668, even though the server received the request and answered it. |
Just confirmed that the failure has nothing to do with header sizes by checking an erroneous response:
|
That’s surprising. There’s gonna be a bug in our code that enforces the 256 KiB limit on headers. The root cause exception is here:
The limit is supposed to start at (256 * 1024) and shrink as headers are processed. Either way we should improve the error messaging here. |
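To make the shrinking limit concrete, here is a minimal sketch of a header reader whose budget starts at 256 * 1024 and decreases by the length of each line read. `SketchHeadersReader` is a hypothetical stand-in written for this illustration, not OkHttp's actual class. The error message assumes the reader reports the smaller of the bytes available and the remaining budget. Under that assumption, `limit=0` on an empty stream would mean that no bytes arrived at all, rather than that the header budget ran out.

```kotlin
import java.io.EOFException

// Starting budget for all response header lines combined (256 KiB).
const val HEADER_LIMIT = 256 * 1024

// Hypothetical stand-in for illustration only; not OkHttp's real implementation.
class SketchHeadersReader(private val input: String) {
    private var position = 0

    // Remaining budget; shrinks as lines are consumed.
    var headerLimit: Long = HEADER_LIMIT.toLong()
        private set

    fun readLine(): String {
        val remaining = (input.length - position).toLong()
        val newline = input.indexOf('\n', position)
        if (newline == -1 || newline - position > headerLimit) {
            // Assumption: report min(bytes available, remaining budget).
            throw EOFException("\\n not found: limit=${minOf(remaining, headerLimit)}")
        }
        val line = input.substring(position, newline).removeSuffix("\r")
        position = newline + 1
        headerLimit -= line.length
        return line
    }
}

fun main() {
    val reader = SketchHeadersReader("HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
    println(reader.readLine())
    println(reader.headerLimit)

    // A stream that ends before any byte arrives fails with limit=0,
    // even though the header budget is untouched.
    try {
        SketchHeadersReader("").readLine()
    } catch (e: EOFException) {
        println(e.message)
    }
}
```

If this model is close to the real behavior, the `limit=0` in the reported trace would be consistent with the connection ending before any response bytes were read, which fits the stale-connection theory better than an oversized header.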
Do you have a URL that consistently triggers this? If you’d like to share it with me privately, my email address is posted at the bottom of publicobject.com. |
@swankjesse No, I don't have a specific URL: it can happen on any URL we call. As far as I can tell, it's more of a timing issue. The problem never happens right after launching the app, only once it has been running for a few minutes. Hence my conclusion that it is probably a thread pool or connection pool handling problem: as if a connection was reused, but when the response is received, the connection is considered stale and discarded, even though it is clearly still working (the server received the request and answered it). |
Interesting! I wonder if we could reproduce this with a single URL and a loop. If you have any URL that I can use to attempt to reproduce this, that’ll help me to fix it. |
(Or a test that uses MockWebServer and a loop!) |
I don't have anything available, but I think you could use anything: as I mentioned, it can happen on pretty much any URL that we call. I'll try to come up with a way to reproduce it, but no guarantees. |
In fact, the same problem can also occur in OkHttp 3.x. The fundamental reason is that the check "requestSendStarted = e !is ConnectionShutdownException" is flawed: okhttp3.internal.http2.StreamResetException is not a ConnectionShutdownException, so requestSendStarted = false. The HTTP request has already been sent, but OkHttp considers the send to have failed and sends the request again. This is very likely to happen on mobile devices: just after a request is sent, the network connection changes, for example when switching from Wi-Fi to 4G.
error: okhttp3.internal.http2.StreamResetException: stream was reset: CANCEL |
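One way to sanity-check this reasoning is to evaluate the quoted expression against stand-in exception types. The two classes below are hypothetical placeholders that only share names with OkHttp's internal exceptions, and the function contains nothing but the expression quoted above. Read literally, the expression is true for every exception that is not a ConnectionShutdownException, StreamResetException included. Whether a retry then follows would depend on other checks in the retry logic, which this sketch does not model.

```kotlin
import java.io.IOException

// Hypothetical stand-ins; the real classes live in okhttp3.internal and differ in detail.
class ConnectionShutdownException : IOException()
class StreamResetException(message: String) : IOException(message)

// The expression quoted in the comment, isolated so it can be evaluated directly.
fun requestSendStarted(e: IOException): Boolean = e !is ConnectionShutdownException

fun main() {
    // A stream reset is not a connection shutdown.
    println(requestSendStarted(StreamResetException("stream was reset: CANCEL")))
    // A connection shutdown is the only case where the expression is false.
    println(requestSendStarted(ConnectionShutdownException()))
}
```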
I don't think that fits my issue: when I put Proxyman in between, I can
see that the server receives the request properly and answers with a
response. What I've seen is that the connection is dropped on the client
side, not on the server. Using retryOnConnectionFailure would work fine if
the server was the one dropping the connection (that's what it's for), but
it doesn't work in my case because the request has already been processed
the first time, so the second attempt fails or ends up as a duplicate
transaction.
On Mon, Dec 30, 2024, 06:35, lclc98 ***@***.***> wrote:
… This sounds like a similar issue I ran into, although I was making HTTP/1.1
requests.
It seems to occur when the server has killed the connection (based on the
server's keep-alive timeout): OkHttp attempts to make a request on the closed
connection and fails with the same EOFException.
Enabling retryOnConnectionFailure seems to be a workaround, as it makes the
request again. Closing connections after each request is another. Neither
was ideal, as I lost visibility into when the problem happened.
The following is what I used to replicate the issue (tested against an nginx
Docker container with keepalive_timeout set to 5 seconds):
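import okhttp3.OkHttpClient
import okhttp3.Request

fun main() {
    // Disable automatic retries so the failure surfaces as an exception.
    val client = OkHttpClient()
        .newBuilder()
        .retryOnConnectionFailure(false)
        .build()
    val request = Request.Builder()
        .url("http://localhost:8080")
        .build()
    // The first call opens a pooled keep-alive connection.
    client.newCall(request).execute().use { response ->
        println(response)
    }
    // Wait longer than the server's 5-second keepalive_timeout.
    Thread.sleep(10000)
    // The second call may reuse the connection that the server already closed.
    client.newCall(request).execute().use { response ->
        println(response)
    }
}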
|
Sorry, you are right, I will delete my comment so I don't add confusion to the issue. When I was looking at Wireshark, I thought I was also getting a similar duplicate-request issue, but after more testing that doesn't seem to be the case. |
We've wanted to upgrade to OkHttp v4.x for quite a while now, but we can't because of a change in the connection pool and its impact on the retry policy. In some cases, we can see a request going to a server and being answered (we use Proxyman to track network flows, and we can see both the request and the response going through), but for some reason OkHttp silently drops that connection even though the exchange was successful, and retries the request (we are not sure whether the connection was deemed stale and therefore closed). So we see a second request going through in Proxyman, which in some cases ends with an error from the server because the request was already processed the first time (authentication, for instance). In our code, the only result we receive is that second one (the server error).
We've encountered the problem with every v4.x version that we've tested so far, but never with v3.x (currently 3.14.9). The exact conditions are hard to reproduce, but if you disable retryOnConnectionFailure, you end up with an IOException instead of a silent retry. This suggests that OkHttp v4.x closed the socket in a way that OkHttp v3.x did not, leading to this exception:
Error response
java.io.IOException: unexpected end of stream on https://www.xxxxxxxxxx.fr/...
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:210)
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
at java.lang.Thread.run(Thread.java:1012)
Caused by: java.io.EOFException: \n not found: limit=0 content=…
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.kt:335)
at okhttp3.internal.http1.HeadersReader.readLine(HeadersReader.kt:29)
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:180)
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
at java.lang.Thread.run(Thread.java:1012)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:525)
Every time that we've tried to upgrade and faced the issue, reverting to OkHttp 3 solved the problem.
PS: I opened this issue a few months ago, but it was closed erroneously (the bug that was fixed had nothing to do with the original problem).