-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write txn shouldn't End() on a failure #18679
Comments
Context why this was introduced #14149 |
As I read through that change, it seems the goal was to make timeouts of read-only txns not cause panic. But the change also makes undesirable side-effect of letting the partial-write end before panic. |
This comment from @ptabor was quite relevant actually (the miss seems to be to not panic right away):
@ahrtr - please share any add'l context you might have from that PR - or I can make a change to remove txn.End and test |
Thanks @shyamjvs for the good catch. Right, we need to remove the etcd/server/etcdserver/txn/txn.go Lines 310 to 311 in f1aefa5
|
Looks like a good place for a failpoint. |
Sorry this took a while, I sent out a fix above. @serathius I'm happy to add failpoints too, but had a few questions first around how to make them effective. Could we discuss over a call (maybe next SIG sync)? |
See example in #17555 |
Open to track the backport effort #18749 (comment) |
What happened?
Offshoot of #18667 (comment)
When two nodes both execute the same write txn (i.e validation step already passed on both), but one ends up in failed execution and the other in success, we shouldn't let the failed node increment its CI and move on - but explicitly fail and crash at the old CI - because we never expect a write txn to fail. Otherwise it would cause asymmetric side-effect across nodes and lead to data inconsistency.
Here's where I think our code may be potentially violating that:
etcd/server/etcdserver/txn/txn.go
Lines 307 to 314 in c1976a6
If I'm reading that correctly, when we run into a txn failure (which may contain some partially executed writes) the txn is ended before triggering server panic - which means there could be a KV rev and CI bump with some incorrect changes. So if/when the server restarts, it will assume it's caught up (till that badly applied txn) and move on to the next record leading to inconsistent state. Reproducing such failure is going to be hard iiuc but it seems like we should panic without calling txn.End here?
What did you expect to happen?
The etcd node on which write txn execution fails should crash without trying to commit the failed transaction (or other side effects like incrementing KV revision or CI).
Anything else we need to know?
There was no observed failure or a confirmed repro I have for this issue, but bringing it up as a potential risk in our current code.
Please share any thoughts about making the above change/trying to repro or if you think this is a non-issue.
/cc @serathius @ahrtr
The text was updated successfully, but these errors were encountered: