fix: broken tx retries for cluster clients after #697 #709

Draft
wants to merge 1 commit into main

rueian (Collaborator) commented Dec 22, 2024

No description provided.

return nil
}
count.m[p]++
count.m[cc]++
Collaborator Author

Make the naming consistent.

break
}
}
if last == cmds.InitSlot {
return nil
}
} else if init {
cc := c.pslots[last]
count.m[cc] += inits
Collaborator Author

The code above this does not count cmds.InitSlot commands toward c.pslots[last], so we add that count back here.
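The counting described in this comment can be modeled roughly as follows. This is a simplified sketch, not the library's code: `initSlot`, `countPerConn`, the `owner` map, and the string connection names are all hypothetical stand-ins for `cmds.InitSlot`, the counting loop, and `c.pslots`. It assumes every keyed command in the batch targets the same slot.

```go
package main

import "fmt"

// initSlot is a hypothetical marker for commands that carry no key slot
// (for example MULTI and EXEC). It stands in for cmds.InitSlot.
const initSlot = -1

// countPerConn tallies how many commands each connection will receive.
// slots[i] is the slot of command i; owner maps a slot to a connection name.
// Slotless commands are tallied separately and added back at the end.
func countPerConn(slots []int, owner map[int]string) map[string]int {
	count := map[string]int{}
	last, inits := initSlot, 0
	for _, s := range slots {
		if s == initSlot {
			inits++ // not attributed to any connection yet
			continue
		}
		last = s
		count[owner[s]]++
	}
	if last != initSlot {
		// Add the slotless commands back to the connection that owns
		// the slot used by the keyed commands.
		count[owner[last]] += inits
	}
	// If every command is slotless, this sketch returns an empty map.
	return count
}

func main() {
	// MULTI, GET foo, EXEC: only GET foo has a slot (42 is made up).
	slots := []int{initSlot, 42, initSlot}
	fmt.Println(countPerConn(slots, map[int]string{42: "node-a"}))
}
```

Without the final addition, the connection would be credited with one command even though three are sent to it.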

rueian force-pushed the fix-cluster-tx-retry branch 3 times, most recently from dbe3db5 to 6bfd890 on December 23, 2024 02:26
rueian changed the title from "fix: broken tx retris for cluster clients after #697" to "fix: broken tx retries for cluster clients after #697" on Dec 23, 2024
wyxloading (Contributor) commented Dec 23, 2024

I found one extreme case:

MULTI
MULTI
GET foo
EXEC

We get a [4]RedisResult in which every result is successful, and resp[3] (the EXEC result) is an array with len == 3. That's not accurate, right? The second MULTI should get the error ERR MULTI calls can not be nested, though the transaction can still EXEC.

EDIT: Maybe we should look for the nearest successful MULTI command, rather than just the nearest MULTI command, when locating the whole transaction to retry?

rueian (Collaborator Author) commented Dec 23, 2024

The second MULTI should get the error ERR MULTI calls can not be nested, though the transaction can still EXEC.

I just tried the double MULTI case but got an EXECABORT when doing EXEC. I think we shouldn't retry the transaction if the nearest MULTI command fails with a Redis error.

wyxloading (Contributor) commented Dec 23, 2024

I think we shouldn't retry the transaction if the nearest MULTI command fails with a Redis error.

Makes sense.

BTW, I got a different response from a redis:7.4 cluster.

cmd:

MULTI
MULTI
PTTL foo
EXEC

response

OK
ERR MULTI calls can not be nested
QUEUED
1) (integer) -2

EDIT:
If we send the second MULTI command after PTTL foo, EXEC returns the same result.
If key foo belongs to a slot in the migrating state, it gets an ASK error, which seems to fit the code that finds a transaction and retries it.

rueian force-pushed the fix-cluster-tx-retry branch from 6bfd890 to f61ee99 on December 23, 2024 05:25
Comment on lines +679 to +682
for mi = i; mi >= 0 && !isMulti(commands[mi]) && !isExec(commands[mi]); mi-- {
}
for ei = i; ei < len(commands) && !isMulti(commands[ei]) && !isExec(commands[ei]); ei++ {
}
Collaborator Author

Both the mi and ei cursors stop at either MULTI or EXEC to avoid crossing tx boundaries.
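To see how the two cursors behave, here is a small standalone sketch of the same scan. It is an illustration only: commands are modeled as plain string slices, and `isMulti`, `isExec`, and `txBounds` are simplified, hypothetical stand-ins rather than the client's real helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// isMulti and isExec are simplified stand-ins that look only at the command name.
func isMulti(cmd []string) bool { return len(cmd) > 0 && strings.EqualFold(cmd[0], "MULTI") }
func isExec(cmd []string) bool  { return len(cmd) > 0 && strings.EqualFold(cmd[0], "EXEC") }

// txBounds walks backward (mi) and forward (ei) from index i. Each cursor
// stops at the first MULTI or EXEC it meets, or runs off the slice
// (mi == -1, ei == len(commands)), so neither crosses a tx boundary.
func txBounds(commands [][]string, i int) (mi, ei int) {
	for mi = i; mi >= 0 && !isMulti(commands[mi]) && !isExec(commands[mi]); mi-- {
	}
	for ei = i; ei < len(commands) && !isMulti(commands[ei]) && !isExec(commands[ei]); ei++ {
	}
	return mi, ei
}

func main() {
	commands := [][]string{
		{"GET", "a"},
		{"MULTI"},
		{"GET", "foo"},
		{"EXEC"},
		{"GET", "b"},
	}
	for i := range commands {
		mi, ei := txBounds(commands, i)
		inTx := mi >= 0 && mi < ei && ei < len(commands) && isMulti(commands[mi]) && isExec(commands[ei])
		fmt.Printf("i=%d mi=%d ei=%d inTx=%v\n", i, mi, ei, inTx)
	}
}
```

Only index 2 (GET foo) is reported as inside a transaction, with mi=1 and ei=3. For index 4 the backward cursor stops at the EXEC at index 3, so the command after the transaction is not mistaken for part of it.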

}
for ei = i; ei < len(commands) && !isMulti(commands[ei]) && !isExec(commands[ei]); ei++ {
}
if mi >= 0 && mi < ei && ei < len(commands) && isMulti(commands[mi]) && isExec(commands[ei]) && resps[mi].val.string == "QUEUED" { // a transaction is found.
rueian (Collaborator Author), Dec 23, 2024

If the nearest MULTI didn't succeed, we don't retry the tx.
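A minimal sketch of that decision, applied to the nested-MULTI case from this thread. The `reply` type and `shouldRetryTx` are hypothetical names for illustration, and the sketch treats "MULTI succeeded" as "its reply is not a Redis error", which is an assumption rather than the exact check in the diff. `mi` and `ei` are the boundary indices found around the failing command.

```go
package main

import "fmt"

// reply is a hypothetical, simplified view of a server response.
type reply struct {
	Str   string
	IsErr bool
}

// shouldRetryTx reports whether commands[mi..ei] form a transaction that is
// worth retrying: MULTI at mi, EXEC at ei, and a MULTI reply that is not an error.
func shouldRetryTx(names []string, replies []reply, mi, ei int) bool {
	if mi < 0 || mi >= ei || ei >= len(names) || ei >= len(replies) {
		return false
	}
	if names[mi] != "MULTI" || names[ei] != "EXEC" {
		return false
	}
	return !replies[mi].IsErr
}

func main() {
	// MULTI, MULTI, PTTL foo, EXEC where PTTL foo hits a redirect.
	// The "ASK" string is a placeholder, not a real server reply.
	names := []string{"MULTI", "MULTI", "PTTL", "EXEC"}
	replies := []reply{
		{Str: "OK"},
		{Str: "ERR MULTI calls can not be nested", IsErr: true},
		{Str: "ASK", IsErr: true},
		{},
	}
	// The nearest MULTI before PTTL is index 1, which failed: no retry.
	fmt.Println(shouldRetryTx(names, replies, 1, 3))

	// A plain MULTI, PTTL foo, EXEC with a successful MULTI: retry.
	fmt.Println(shouldRetryTx(names[1:], []reply{{Str: "OK"}, {Str: "ASK", IsErr: true}, {}}, 0, 2))
}
```

The first call prints false and the second prints true, which matches the position taken here: the nearest MULTI decides, and the scan does not look further back for an earlier successful one.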

rueian force-pushed the fix-cluster-tx-retry branch from f61ee99 to b4c543d on December 23, 2024 05:47
rueian (Collaborator Author) commented Dec 23, 2024

I think we shouldn't retry the transaction if the nearest MULTI command fails with a Redis error.

Makes sense.

BTW, I got a different response from a redis:7.4 cluster.

cmd:

MULTI
MULTI
PTTL foo
EXEC

response

OK
ERR MULTI calls can not be nested
QUEUED
1) (integer) -2

EDIT: If we send the second MULTI command after PTTL foo, EXEC returns the same result. If key foo belongs to a slot in the migrating state, it gets an ASK error, which seems to fit the code that finds a transaction and retries it.

I tested these cases on Valkey 8, and they all failed with EXECABORT. In any case, I think we should simply not retry in these weird cases.
