etcd Go & Java client SDK's retry mechanism may break Serializable
#18424
Comments
@lavacat are you interested in working on this issue as discussed in the community meeting?
Hi, I'd like to work on this issue and would appreciate some guidance. Could we discuss the details here or on Slack, if that's more convenient?
Thank you. Unfortunately, this issue isn't
Let me know if you are working on this. I will work on it if I do not see a response by the end of next week.
@ahrtr, yes, will try to find time this week. Please assign to me.
/assign @lavacat
Potential duplication of non-idempotent requests is a known problem with Raft-based systems. There's half a section dedicated to it in the extended Raft paper, in section 8:
The proposed solution is somewhat heavy-handed and requires deep changes, including in the server (because the de-duplication logic should be part of the replicated state machine), but it seems difficult to solve the problem without it. If you simply tell the client "don't retry non-idempotent operations on failure", it adds significant burden on the user for handling such failures.
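For readers who don't have the paper at hand, here is a minimal sketch (not etcd code, all names are illustrative) of the section 8 idea: the replicated state machine tracks, per client session, the highest request sequence number already applied and its cached response, so a retried duplicate is answered from the cache instead of being executed twice.

```go
package main

import "fmt"

// Command is a client request carried through the Raft log. ClientID and Seq
// identify the request for de-duplication, as in section 8 of the Raft paper.
type Command struct {
	ClientID string
	Seq      uint64
	Op       func(state map[string]string) string
}

// session remembers the last request applied for one client session.
type session struct {
	lastSeq  uint64
	lastResp string
}

// StateMachine applies committed commands at most once per (ClientID, Seq)
// and answers duplicates from the cached response.
type StateMachine struct {
	state    map[string]string
	sessions map[string]*session
}

func NewStateMachine() *StateMachine {
	return &StateMachine{state: map[string]string{}, sessions: map[string]*session{}}
}

// Apply is called for every committed log entry, in log order, on every replica.
func (sm *StateMachine) Apply(cmd Command) string {
	if s, ok := sm.sessions[cmd.ClientID]; ok && cmd.Seq <= s.lastSeq {
		// Duplicate delivery (e.g. a client retry after a lost response):
		// return the cached result instead of executing the operation again.
		return s.lastResp
	}
	resp := cmd.Op(sm.state)
	sm.sessions[cmd.ClientID] = &session{lastSeq: cmd.Seq, lastResp: resp}
	return resp
}

func main() {
	sm := NewStateMachine()
	put := Command{ClientID: "c1", Seq: 1, Op: func(st map[string]string) string {
		st["k"] += "x" // a non-idempotent append
		return st["k"]
	}}
	fmt.Println(sm.Apply(put)) // "x"
	fmt.Println(sm.Apply(put)) // still "x": the retried command is not re-applied
}
```

The cost is the deep change noted above: the session table is part of the replicated state, so every replica must maintain it identically.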
Your comment is not exactly the same as this issue, but the two are somewhat related. For a distributed system, it's possible that a client gets an error response due to some temporary issue (e.g. network jitter) while the server side has actually already successfully processed the request. It's rare, but it happens. From the client's perspective, if it gets a successful response, then it can trust the response. But if it gets an error response, that doesn't mean the server side indeed failed. Note that the following comment focuses on the case where the client gets an error response. etcd has two kinds of data: key-space (key/value) data and non-key-space data (i.e. membership data).

Key space

For key-space data, etcd supports MVCC (multi-version concurrency control), refer to here. When the client gets an error response for a request against the key space, it has two choices.
Obviously the first choice is much simpler, but has minor problems. The second doesn't have those problems, but is more complex; a sketch of it is shown below.

Non-key space

For non-key-space data (i.e. membership data), etcd doesn't support MVCC. The client can still follow a similar pattern (check before operation), but it can't use TXN. Please refer to an example in kubernetes/kubeadm#3111.
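A minimal sketch of the "check before operation" pattern for key-space data, using the Go client (clientv3); the endpoint, key, and value are placeholders, not taken from this issue. The write is guarded by a compare on the revision observed in a prior read, so re-issuing the same transaction after an ambiguous error cannot apply the update twice.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()
	key, newVal := "demo-key", "demo-value" // placeholder key/value

	// 1. Check: read the current state and remember its mod revision
	//    (0 if the key does not exist yet).
	getResp, err := cli.Get(ctx, key)
	if err != nil {
		log.Fatal(err)
	}
	var rev int64
	if len(getResp.Kvs) > 0 {
		rev = getResp.Kvs[0].ModRevision
	}

	// 2. Operate: only write if the key is still at the observed revision.
	//    If an earlier attempt that looked like a failure actually succeeded
	//    on the server, this compare fails instead of applying the update twice.
	txnResp, err := cli.Txn(ctx).
		If(clientv3.Compare(clientv3.ModRevision(key), "=", rev)).
		Then(clientv3.OpPut(key, newVal)).
		Commit()
	if err != nil {
		// Ambiguous outcome (e.g. timeout): safe to re-run the whole
		// check-then-txn sequence rather than blindly re-sending the Put.
		log.Fatal(err)
	}
	fmt.Println("applied:", txnResp.Succeeded)
}
```

The extra complexity mentioned above is the additional round trip and the need to decide what to do when the compare fails (re-read and re-derive the update, or give up).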
Any progress? @lavacat
I'm focusing on reproducing interleaving transactions with the default Go client. We have 3 possibilities for retry:
We'll ignore 2 for now (but auth is checked before proposal enters raft).
I think in this case there is no chance the first attempt got to raft. We're left only with 1, which becomes interesting. I'd expect that if the server side generates a Cancel or Deadline error, the client should retry. But that's not the case for Txn: parseProposeCtxErr converts the context error, so it's not possible for Txn to retry. For the Java client the logic is different, and I think that's the reason the Jepsen test failed.
Indeed it seems like a bug. But let's not change it until there is a clear summary of the existing errors (see also #18493 (comment)). You can intentionally inject an error in
It might not be safe to automatically retry when seeing a context error, because the server side may have successfully processed the request. See etcd/client/v3/retry_interceptor.go, line 72 in c79c7d5, and lines 348 to 350 in c79c7d5.
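One way a caller can treat such an error as an unknown outcome is to verify rather than retry: re-read the key and check whether the first attempt was applied. This is a hedged sketch with placeholder names, and it only works when the written value is distinguishable from older values.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// putWithVerify issues a Put and, on a context error (outcome unknown),
// re-reads the key to learn whether the write actually landed.
func putWithVerify(cli *clientv3.Client, key, val string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	_, err := cli.Put(ctx, key, val)
	if err == nil {
		return nil
	}
	if errors.Is(err, context.DeadlineExceeded) || errors.Is(err, context.Canceled) {
		// The server may or may not have applied the Put. Verify instead of
		// blindly re-sending a non-idempotent request.
		vctx, vcancel := context.WithTimeout(context.Background(), 2*time.Second)
		defer vcancel()
		get, gerr := cli.Get(vctx, key)
		if gerr != nil {
			return gerr
		}
		if len(get.Kvs) > 0 && string(get.Kvs[0].Value) == val {
			return nil // the first attempt was applied after all
		}
	}
	return err
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()
	fmt.Println(putWithVerify(cli, "demo-key", "demo-value"))
}
```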
Based on the investigation that we have done so far, this issue can't be reproduced with the etcd Go SDK, so I will unpin the issue and de-escalate the priority. Proposed follow-up actions:
Background
The Jepsen team raised issue #14890, stating that etcd may cause lost updates and cyclic information flow. There is a long discussion there.
Firstly, there was strong evidence indicating that it isn't an etcdserver issue and that a key was written twice by the client; refer to #14890 (comment). So we thought it might be a jetcd or Jepsen issue.
Eventually it turned out to be caused by the client's retry mechanism. Refer to etcd/client/v3/client.go, line 273 in 9f58999.
Note that Jepsen uses jetcd (the Java client), but I believe the etcd Go client SDK also has this issue.
Breaks Serializable

When a database system processes multiple concurrent transactions, it must produce the same effect as some serial execution of those transactions. This is what Serializable means. But the etcd client SDK's retry mechanism (both Go and Java) may break Serializable.

Let's work with an example, assuming there are two concurrent transactions. Based on the definition of Serializable, the final result must be the same as executing the two transactions in some serial order, so there are only two possibilities. But the client's retry may lead to a third possibility; see an example workflow below.

So finally it leads to cyclic information flow, which breaks Serializable.
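The concrete keys and values of the original workflow are not reproduced here; as a hedged illustration of the shape involved, below is a Jepsen-style pair of transactions written with the Go client, each one a single Txn that reads both keys and writes one of them. If T1's first attempt is applied but reported as an error, T2 then commits, and the SDK transparently re-sends T1, the second execution of T1 observes T2's write while T2 already observed the first T1, which matches neither serial order.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// t1 reads both keys and writes x; t2 reads both keys and writes y.
// Each is a single unconditional Txn, which the retry logic may re-send.
func t1(ctx context.Context, cli *clientv3.Client) (*clientv3.TxnResponse, error) {
	return cli.Txn(ctx).Then(
		clientv3.OpGet("x"),
		clientv3.OpGet("y"),
		clientv3.OpPut("x", "written-by-T1"),
	).Commit()
}

func t2(ctx context.Context, cli *clientv3.Client) (*clientv3.TxnResponse, error) {
	return cli.Txn(ctx).Then(
		clientv3.OpGet("x"),
		clientv3.OpGet("y"),
		clientv3.OpPut("y", "written-by-T2"),
	).Commit()
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()
	// In the problematic interleaving, t1 and t2 run from two concurrent
	// clients; t1's first attempt is applied but reported as failed, t2
	// commits, and then t1 is transparently re-sent and applied again.
	go func() {
		if _, err := t1(ctx, cli); err != nil {
			log.Println("t1:", err)
		}
	}()
	if resp, err := t2(ctx, cli); err == nil {
		fmt.Println("t2 committed at revision", resp.Header.Revision)
	}
	time.Sleep(time.Second) // let the goroutine finish (demo only)
}
```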
Break Read Committed

Let's work with an example/workflow.

Obviously, from the client's perspective, it should read 277/4 in such a case, because it's confirmed committed. So it breaks Read Committed. Breaking Read Committed means the client sees uncommitted data, i.e. a dirty read.

EDIT: even without the client's retry, it's also possible for users to run into this "issue", because a user may get a failure response even though etcdserver has already successfully processed the request. We know it's a little confusing to users, but it isn't an issue from etcd's perspective. The Proposal (see below) can mitigate it, but can't completely resolve it.
What did you expect to happen?
etcd should never break Serializable, nor Read Committed.
How can we reproduce it (as minimally and precisely as possible)?
See the workflow mentioned above. We need to create two e2e test cases to reproduce this issue.
We can leverage a gofail failpoint to reproduce the Serializable issue: when etcdserver receives two transaction requests, it intentionally returns a failure response for the first transaction, but only once; when etcdserver receives the retried transaction, it should return success.
We can also leverage a sleep failpoint to interleave the execution of the two transactions.

Proposal
See also #14890 (comment).
Action
Reference