Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle consensus #350

Merged
merged 29 commits into from
Sep 17, 2024
Merged

handle consensus #350

merged 29 commits into from
Sep 17, 2024

Conversation

andrewkmin
Copy link
Collaborator

@andrewkmin andrewkmin commented Sep 4, 2024

Summary & Motivation

$title

TLDR: Previously, the ACTIVITY_STATUS_CONSENSUS_NEEDED case was blatantly unhandled, resulting in misleading error messages/failure modes.

This change does the following:

  • makes changes to the base sdk-browser and sdk-server clients to handle non-terminal activity statuses
  • update various client libraries that depend on sdk-browser and sdk-server to allow for awaiting consensus
  • adds helpers to retrieve signed transactions + signed messages once consensus has been met on the Turnkey side (i.e. in case of a timeout)
  • @turnkey/viem now returns viem BaseErrors that wrap around TurnkeyActivityErrors. This also implicitly addresses err.walk is not a function #121
  • includes new example scripts for the ethers, viem, and solana packages (e.g. retry.ts and consensus.ts) which exercise this new functionality

The end user experience is such that if there is a policy enforcing consensus, you can configure a poller to await consensus to be met. See screen recordings in the testing section below.

TODO:

  • add unit tests (ideally — this requires policies)

How I Tested These Changes

local examples

Live polling for consensus during Ethers signing scripts:

Screen.Recording.2024-09-07.at.1.49.25.AM.mov

Consensus is not met at first within the polling retry limit, then later met, and the signature is retroactively retrieved + added to a Solana transaction
https://github.com/user-attachments/assets/af7fb6e3-46df-4785-a8ec-e2e7d4d1ff7e

Did you add a changeset?

If updating one of our packages, you'll likely need to add a changeset to your PR. To do so, run pnpm changeset. pnpm changeset will generate a file where you should write a human friendly message about the changes. Note how this (example) includes the package name (should be auto added by the command) along with the type of semver change (major.minor.patch) (which you should set).

These changes will be used at release time to determine what packages to publish and how to bump their version. For more context see this comment.

Copy link

codesandbox-ci bot commented Sep 4, 2024

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

});

return assertNonNull(activity?.signedTransaction);
if (activity.status !== "ACTIVITY_STATUS_COMPLETED") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly does this look like in the case where an activity requires consensus? Should an error be thrown?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question — ended up having the default behavior be to poll for 3 seconds, then return an error. At that point, it's up to the end-user to figure out how they want to handle the error. I've included some examples of handling in the various retry.ts files

examples/with-ethers/src/consensus.ts Outdated Show resolved Hide resolved
Comment on lines 63 to 66
// 1. Sign a raw payload (`eth_sign` style)
const message = "Hello Turnkey";
const signature = await connectedSigner.signMessage(message);
const recoveredAddress = ethers.verifyMessage(message, signature);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes this example "consensus.ts"? I think consensus vs. not will be driven by the organization setting? (1 vs N root quorum users?)

(basically, it doesn't seem necessary to have separate consensus examples unless I'm missing something 🤔 )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call! not that useful. retry.ts should be enough to serve the purpose of demonstrating the scenario where one might be waiting for consensus

Comment on lines 146 to 151
throw new TurnkeyActivityError({
message: `Unexpected activity status: ${activity.status}`,
activityId: activity.id,
activityStatus: activity.status as TActivityStatus,
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key is to throw a different error when the activity is in need of consensus. Something like:

if (activity.status == "ACTIVITY_STATUS_NEEDS_CONSENSUS") {
    throw new TurnkeyActivityNeedsConsensus({
          message: `Activity requires consensus`,
          activityId: activity.id,
          activityStatus: activity.status as TActivityStatus,
        });
}

Then I agree with @andrewkmin, the retry or error behavior can be up to the caller.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ended up adding some bespoke errors to viem. the thing is it needs .walk() so I ended up extending the BaseError class. let me know what you think — if it adds unnecessary complexity, happy to cut it or reduce it down!

Copy link
Contributor

@zkharit zkharit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 🚢 🚀

@andrewkmin andrewkmin force-pushed the andrew/consensus branch 4 times, most recently from 8ffd1fb to c4952bd Compare September 11, 2024 00:20
@andrewkmin andrewkmin requested a review from r-n-o September 11, 2024 00:30
@andrewkmin andrewkmin force-pushed the andrew/consensus branch 2 times, most recently from b31ce3d to 7b99033 Compare September 11, 2024 19:32
@andrewkmin andrewkmin force-pushed the andrew/consensus branch 3 times, most recently from 9029f26 to ccc9a93 Compare September 13, 2024 14:37
Comment on lines 36 to 45
// The following config is useful in contexts where an activity requires consensus.
// By default, if the activity is not initially successful, it will poll a maximum
// of 3 times with an interval of 10000 milliseconds.
//
// -----
//
// activityPoller: {
// intervalMs: 10_000,
// numRetries: 5,
// },
Copy link
Contributor

@r-n-o r-n-o Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still relevant?

EDIT: I think it is! Ignore this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! a bit ambiguous with the numbers though; will change them such that it might be a bit less confusing


result = assertNonNull(activity?.result?.signRawPayloadResult);
} else {
result = await this.client.signRawPayload({
// const trying = this.client as TurnkeyClient;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment was probably not meant to be a permanent comment? :)

}
}

export function checkActivityStatus(input: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkActivityStatus implies it's a stateless innocuous function but I think this is more heavy than it looks: it will throw errors! So maybe assertActivityCompleted since that's kind of what it boils down to?

Copy link
Collaborator Author

@andrewkmin andrewkmin Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def more accurate!

activityStatus: TActivityStatus | null;
activityType: TActivityType | null;
cause: Error | null;
activityId: TActivityId | undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change from null to undefined?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went into this because of type compatibility, and ended up going with undefined for consistency. (functionally, null == undefined = true). Specifically, I feel like param: <Type X> | undefined better conveys the idea that the absence of the param is okay

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is activityId: TActivityId | undefined; equivalent to activityId?: TActivityId; then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case I'm a definite +1 on the change; simpler to reason about!

].includes(activity.type)
) {
throw new TurnkeyActivityError({
message: `Unexpected activity type: ${activity.type}`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the message could be "Cannot get signature from activity type ...etc...". Otherwise it's a bit unclear why it's "unexpected".

"ACTIVITY_TYPE_SIGN_RAW_PAYLOAD_V2",
].includes(activity.type)
) {
throw new TurnkeyActivityError({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also thinking this should not be called "TurnkeyActivityError". It's likely an SDK-level / user-level issue if this function is called on the wrong activity type. We might need a plain BadArgumentError? Otherwise users will be led to believe the issue is within the turnkey activity processing when they get this error.

packages/http/src/shared.ts Outdated Show resolved Hide resolved
Comment on lines 331 to 341
let attempts = 0;

const pollStatus = async (activityId: string): Promise<TResponseType> => {
const pollBody = { activityId };
const pollData = await this.getActivity(pollBody) as ActivityResponse;

if (attempts > maxRetries) {
return handleResponse(pollData);
}

attempts += 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Finally fixing the "retry forever" behavior! 🏆

packages/sdk-browser/src/__types__/base.ts Show resolved Hide resolved
Comment on lines +27 to +30
export type TTurnkeyConsensusNeededErrorType = TurnkeyConsensusNeededError & {
name: "TurnkeyConsensusNeededError";
};
export class TurnkeyConsensusNeededError extends BaseError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of defining these errors here, can we pull that into http? Is there a way to comply with viem's error interface (implement walk is one thing; maybe others?) without importing viem's BaseError directly?

We'd need to implement this interface in our http/shared errors: https://github.com/wevm/viem/blob/408948bb74685e9d562cd39f50768f9b282463fe/src/errors/base.ts#L37-L83

I don't think it'd hurt anything to have our errors implement more methods than they need to, but maybe not something we need right now; feels a bit awkward to be pushed by a particular framework to implement these methods for all Turnkey errors.

Copy link
Collaborator Author

@andrewkmin andrewkmin Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the sentiment. I'm down to keep these encapsulated to just Viem for now. feels a bit awkward either way but we implement more at the base layer down the line if more cases arise!

Copy link
Contributor

@r-n-o r-n-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

}
}

export function assertActivityCompleted(input: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm seeing this with fresh eyes: the activity ID isn't used anywhere really, we're simply comparing the passed in status to CONSENSUS_NEEDED and COMPLETED. Don't think it's necessarily worth changing this but...a bit of an odd interface given nothing guarantees the passed in status come from the passed in activity ID :/

Maybe it'd be clearer if a TActivity was expected and passed in instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempting this here: aa16d4d

The drawback of this approach is that it's a bit duplicative -- e.g. we return something like (paraphrasing)

{
...
result: activityData["result"]["activityResult"],
activity: activityData
...
}

So you see the overlap there, but having the result up front and center just makes access easier. The reason we needed this in the first place was originally for type-compatibility, but I realized it generally could be helpful to return all activity details

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it's definitely not super clear-cut, but I like this new version: assertActivityCompleted(activity); reads suuuper nice 😍

@andrewkmin andrewkmin merged commit 1e695f8 into main Sep 17, 2024
5 checks passed
@andrewkmin andrewkmin deleted the andrew/consensus branch September 17, 2024 01:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants