-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: missing signers' votes #1243
base: main
Are you sure you want to change the base?
Conversation
signer/src/storage/postgres.rs
Outdated
async fn get_deposit_signer_decisions( | ||
&self, | ||
chain_tip: &model::BitcoinBlockHash, | ||
context_window: u16, | ||
signer_public_key: &PublicKey, | ||
) -> Result<Vec<model::DepositSigner>, Error> { | ||
sqlx::query_as::<_, model::DepositSigner>( | ||
r#" | ||
WITH RECURSIVE context_window AS ( | ||
-- Anchor member: Initialize the recursion with the chain tip | ||
SELECT block_hash, block_height, parent_hash, created_at, 1 AS depth | ||
FROM sbtc_signer.bitcoin_blocks | ||
WHERE block_hash = $1 | ||
|
||
UNION ALL | ||
|
||
-- Recursive member: Fetch the parent block using the last block's parent_hash | ||
SELECT parent.block_hash, parent.block_height, parent.parent_hash, | ||
parent.created_at, last.depth + 1 | ||
FROM sbtc_signer.bitcoin_blocks parent | ||
JOIN context_window last ON parent.block_hash = last.parent_hash | ||
WHERE last.depth < $2 | ||
), | ||
transactions_in_window AS ( | ||
SELECT transactions.txid | ||
FROM context_window blocks_in_window | ||
JOIN sbtc_signer.bitcoin_transactions transactions ON | ||
transactions.block_hash = blocks_in_window.block_hash | ||
) | ||
SELECT | ||
deposit_signers.txid | ||
, deposit_signers.output_index | ||
, deposit_signers.signer_pub_key | ||
, deposit_signers.can_sign | ||
, deposit_signers.can_accept | ||
FROM transactions_in_window transactions | ||
JOIN sbtc_signer.deposit_signers USING (txid) | ||
WHERE deposit_signers.signer_pub_key = $3 | ||
"#, | ||
) | ||
.bind(chain_tip) | ||
.bind(i32::from(context_window)) | ||
.bind(signer_public_key) | ||
.fetch_all(&self.0) | ||
.await | ||
.map_err(Error::SqlxQuery) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple optimization could be to specialize a bit the query and only get decision of the deposits which have less than the total number of signers entries in the table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is doing something slightly different from what we initially talked about: here we pick the deposits that were confirmed (on the canonical chain) in the last three blocks, but this means that if we fetch from Emily a deposit that was already confirmed but submitted (to emily) only recently we may not retry it.
This is fine wrt the original issue (signers getting a vote for a viewed-as-unconfirmed deposit), but if we were to have a real retry 3 times approach we would also add redundancy to lost votes messages (for other, unrelated reasons) -- but it would require adding more info somewhere else in the db, so maybe it's not really worth it.
Also a possible simplification here could be to ignore the canonical chain, but given the context window is so small the recursive part shouldn't be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great thinking on the edge case of missing older transactions! A simple approach that avoids modifying the database could be to use the created_at
column of the bitcoin block in the bitcoin_blocks
table relative to the selected context_window. With each new block, we can retrieve the creation time of the block received N-updates ago and fetch all decisions with a created_at
greater than that block's creation time. The only downside is that if blocks arrive very close to each other, we might end up retrying a vote an additional time. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's a nice way to avoid extra stuff! And an extra retry shouldn't really be an issue, we would just send 4x more votes message but I don't think that's overwhelming -- but maybe we should double check, just in case.
I'm okay with leveraging created_at
, but maybe let's check also with @djordon or @cylewitruk , just to be sure it's not an overkill or we are missing some other corner case.
chain_tip: &BitcoinBlockHash, | ||
) -> Result<(), Error> { | ||
for decision in decisions.into_iter().map(SignerDepositDecision::from) { | ||
self.send_message(decision, chain_tip).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this retry is a "best effort" type of thing, should we just log the error in here instead of bailing at the first one?
signer/src/storage/postgres.rs
Outdated
async fn get_deposit_signer_decisions( | ||
&self, | ||
chain_tip: &model::BitcoinBlockHash, | ||
context_window: u16, | ||
signer_public_key: &PublicKey, | ||
) -> Result<Vec<model::DepositSigner>, Error> { | ||
sqlx::query_as::<_, model::DepositSigner>( | ||
r#" | ||
WITH RECURSIVE context_window AS ( | ||
-- Anchor member: Initialize the recursion with the chain tip | ||
SELECT block_hash, block_height, parent_hash, created_at, 1 AS depth | ||
FROM sbtc_signer.bitcoin_blocks | ||
WHERE block_hash = $1 | ||
|
||
UNION ALL | ||
|
||
-- Recursive member: Fetch the parent block using the last block's parent_hash | ||
SELECT parent.block_hash, parent.block_height, parent.parent_hash, | ||
parent.created_at, last.depth + 1 | ||
FROM sbtc_signer.bitcoin_blocks parent | ||
JOIN context_window last ON parent.block_hash = last.parent_hash | ||
WHERE last.depth < $2 | ||
), | ||
transactions_in_window AS ( | ||
SELECT transactions.txid | ||
FROM context_window blocks_in_window | ||
JOIN sbtc_signer.bitcoin_transactions transactions ON | ||
transactions.block_hash = blocks_in_window.block_hash | ||
) | ||
SELECT | ||
deposit_signers.txid | ||
, deposit_signers.output_index | ||
, deposit_signers.signer_pub_key | ||
, deposit_signers.can_sign | ||
, deposit_signers.can_accept | ||
FROM transactions_in_window transactions | ||
JOIN sbtc_signer.deposit_signers USING (txid) | ||
WHERE deposit_signers.signer_pub_key = $3 | ||
"#, | ||
) | ||
.bind(chain_tip) | ||
.bind(i32::from(context_window)) | ||
.bind(signer_public_key) | ||
.fetch_all(&self.0) | ||
.await | ||
.map_err(Error::SqlxQuery) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is doing something slightly different from what we initially talked about: here we pick the deposits that were confirmed (on the canonical chain) in the last three blocks, but this means that if we fetch from Emily a deposit that was already confirmed but submitted (to emily) only recently we may not retry it.
This is fine wrt the original issue (signers getting a vote for a viewed-as-unconfirmed deposit), but if we were to have a real retry 3 times approach we would also add redundancy to lost votes messages (for other, unrelated reasons) -- but it would require adding more info somewhere else in the db, so maybe it's not really worth it.
Also a possible simplification here could be to ignore the canonical chain, but given the context window is so small the recursive part shouldn't be an issue.
signer/src/storage/postgres.rs
Outdated
WITH RECURSIVE context_window AS ( | ||
-- Anchor member: Initialize the recursion with the chain tip | ||
SELECT block_hash, block_height, parent_hash, created_at, 1 AS depth | ||
FROM sbtc_signer.bitcoin_blocks | ||
WHERE block_hash = $1 | ||
|
||
UNION ALL | ||
|
||
-- Recursive member: Fetch the parent block using the last block's parent_hash | ||
SELECT parent.block_hash, parent.block_height, parent.parent_hash, | ||
parent.created_at, last.depth + 1 | ||
FROM sbtc_signer.bitcoin_blocks parent | ||
JOIN context_window last ON parent.block_hash = last.parent_hash | ||
WHERE last.depth < $2 | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use bitcoin_blockchain_of
here?
signer/tests/integration/postgres.rs
Outdated
// Test data contains 5 deposit requests, we should get decisions for | ||
// the last 3. | ||
for deposit in test_data.deposit_requests[2..].iter() { | ||
assert!(deposit_decisions.iter().any(|decision| { | ||
decision.txid == deposit.txid | ||
&& decision.output_index == deposit.output_index | ||
&& decision.signer_pub_key == *signer_pub_key | ||
})); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't always work: TestData
generates a chain with forks, so picking the last 3 deposits requests doesn't guarantee that those 3 are all on the canonical chain; it does work with 51
as random seed, but if you change that to 52
the test fails.
Description
Closes: #1237
Changes
deposit_decisions_retry_window
param to the signers' settings (defaults to 3)DepositSigner
entries from the DB from the last N blocksTesting Information
Checklist: