
kvinwang commented Feb 9, 2026

Summary

  • add optional requireTcbUpToDate flag to DstackApp (initializer + setter) and enforce it in isAppAllowed
  • extend KMS factory deploy to accept the new flag and keep a backward-compatible overload
  • update hardhat task, foundry cheatsheet, and tests

Upgrade notes (existing KMS deployments)

  • No storage layout change for DstackKms; upgrade is in-place via UUPS proxy.
  • Deploy new DstackKms implementation (e.g. npx hardhat kms:deploy-impl).
  • Call upgradeTo(address) on the existing KMS proxy.
  • No initializer is required for the KMS upgrade.
  • Important: if you want new apps to use the updated DstackApp implementation, deploy a new DstackApp implementation and call setAppImplementation(address) on the KMS proxy. Existing app proxies can be upgraded separately via their own upgradeTo (if upgrades are not disabled).

Testing

  • cd kms/auth-eth && npm test

kvinwang force-pushed the feature/app-tcb-toggle branch from 90140f9 to d660bc0 on February 9, 2026 02:32
kvinwang force-pushed the feature/app-tcb-toggle branch from d660bc0 to bfeaa5b on February 9, 2026 02:33
… remove KMS backward-compat factory

DstackApp:
- Extract shared init logic into _initializeCommon() internal function
- Add old 5-param initialize(address,bool,bool,bytes32,bytes32) overload
  for upgrade compatibility with existing proxies
- Keep 6-param initialize with requireTcbUpToDate for new deployments
- Add version() pure function returning 2 for capability detection

DstackKms:
- Remove 5-param deployAndRegisterApp backward-compat overload
- Remove _deployAndRegisterApp internal function
- Keep only 6-param deployAndRegisterApp with inline logic,
  callers must explicitly specify requireTcbUpToDate
…task

- deployContract() accepts optional initializer signature param,
  passed to hre.upgrades.deployProxy for overload disambiguation
- estimateDeploymentCost() accepts optional initializer param,
  passed to encodeFunctionData
- app:deploy task adds --requireTcbUpToDate flag and passes 6 args
  with explicit initializer signature
…tion

- setup.ts / DstackApp.test.ts: pass explicit initializer signature
  to deployContract and hre.upgrades.deployProxy
- DstackApp.test.ts: add version() test, add TCB-via-initialize tests
- DstackApp.upgrade.test.ts (new): 9 cases covering:
  - Old 5-param proxy upgrade preserves storage
  - version()=2 available after upgrade
  - requireTcbUpToDate defaults false (no silent behavior change)
  - setRequireTcbUpToDate works post-upgrade with access control
  - isAppAllowed TCB enforcement after owner opt-in
  - KMS factory deploys with TCB on/off, end-to-end verification

Leechael commented Feb 11, 2026

Manual Upgrade Test Plan (anvil fork of Base)

Setup

Terminal 1 — start anvil:

anvil --fork-url https://mainnet.base.org

Terminal 2 — all commands below:

cd kms/auth-eth
export RPC=http://127.0.0.1:8545
export PK=0xdf57089febbacf7ba0bc227dafbffa9fc08a93fdc68e1e42411a14efcf23656e
export IMPL_SLOT=0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc

Phase 1: Deploy old version (master)

git checkout master
npx hardhat clean && npx hardhat compile

npx hardhat kms:deploy --with-app-impl --network test
# Record output, set env vars:
export KMS_PROXY=<DstackKms proxy address>
export OLD_APP_IMPL=<DstackApp implementation address>
export KMS_CONTRACT_ADDRESS=$KMS_PROXY

npx hardhat kms:create-app \
  --allow-any-device \
  --hash 0x0000000000000000000000000000000000000000000000000000000000000001 \
  --network test
# Record output, set env var:
export EXISTING_APP=<App proxy address>

Phase 2: Record pre-upgrade baseline

# Old version has no version(), expect revert
cast call $EXISTING_APP "version()(uint256)" --rpc-url $RPC

# Old version has no requireTcbUpToDate, expect revert
cast call $EXISTING_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC

# Existing data should be present
cast call $EXISTING_APP "allowedComposeHashes(bytes32)(bool)" \
  0x0000000000000000000000000000000000000000000000000000000000000001 \
  --rpc-url $RPC
# Expected: true

cast call $EXISTING_APP "allowAnyDevice()(bool)" --rpc-url $RPC
# Expected: true

cast call $EXISTING_APP "owner()(address)" --rpc-url $RPC
# Expected: deployer address

# Record current implementation address (ERC1967 slot)
cast storage $EXISTING_APP $IMPL_SLOT --rpc-url $RPC
# Should match OLD_APP_IMPL (left-zero-padded to 32 bytes)
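
The padded-slot comparison above is easy to get wrong by eye, since `cast storage` returns a lowercase 32-byte word while deploy output is usually a checksummed 20-byte address. A minimal sketch of a helper that does the comparison mechanically, assuming a POSIX shell; `slot_to_address` and `impl_matches` are hypothetical names, not part of the repo:

```shell
# Hypothetical helper: turn a 32-byte storage word into a 20-byte address
# by keeping the last 40 hex characters.
slot_to_address() (
  word=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  word=${word#0x}
  # A 32-byte word is 64 hex chars; anything else is not a storage word.
  [ "${#word}" -eq 64 ] || { echo "expected a 32-byte hex word" >&2; exit 1; }
  printf '0x%s\n' "$(printf '%s' "$word" | cut -c25-64)"
)

# Hypothetical helper: succeed only if the storage word holds the expected
# address. Both sides are lowercased, so checksum casing does not matter.
impl_matches() (
  actual=$(slot_to_address "$1") || exit 1
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
)

# Usage on the fork (assumes the variables from the plan are set):
#   impl_matches "$(cast storage "$EXISTING_APP" "$IMPL_SLOT" --rpc-url "$RPC")" "$OLD_APP_IMPL" \
#     && echo "impl OK" || echo "impl MISMATCH"
```

The same helper applies unchanged to the Phase 4 check against NEW_APP_IMPL.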

Phase 3: Switch to new code and upgrade

git checkout feature/app-tcb-toggle
npx hardhat clean && npx hardhat compile

# 3a. Deploy new DstackApp implementation
npx hardhat app:deploy-impl --network test
export NEW_APP_IMPL=<new DstackApp impl address>

# Verify old and new impl addresses differ
echo "OLD: $OLD_APP_IMPL"
echo "NEW: $NEW_APP_IMPL"
# These MUST be different; if identical, compilation is stale

# 3b. Deploy new DstackKms implementation and upgrade KMS proxy
npx hardhat kms:deploy-impl --network test
npx hardhat kms:upgrade --address $KMS_PROXY --network test

# 3c. Update KMS app implementation pointer
npx hardhat kms:set-app-implementation $NEW_APP_IMPL --network test

# 3d. Upgrade existing DstackApp proxy via direct UUPS call
#     (bypasses hardhat-upgrades plugin manifest issues with factory-deployed proxies)
cast send $EXISTING_APP "upgradeToAndCall(address,bytes)" \
  $NEW_APP_IMPL 0x \
  --private-key $PK --rpc-url $RPC

Phase 4: Post-upgrade verification

# Implementation address updated
cast storage $EXISTING_APP $IMPL_SLOT --rpc-url $RPC
# Expected: NEW_APP_IMPL (left-zero-padded to 32 bytes)

# version() = 2
cast call $EXISTING_APP "version()(uint256)" --rpc-url $RPC
# Expected: 2

# requireTcbUpToDate defaults to false (no silent behavior change)
cast call $EXISTING_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: false

# Old data preserved
cast call $EXISTING_APP "allowedComposeHashes(bytes32)(bool)" \
  0x0000000000000000000000000000000000000000000000000000000000000001 \
  --rpc-url $RPC
# Expected: true

cast call $EXISTING_APP "allowAnyDevice()(bool)" --rpc-url $RPC
# Expected: true

cast call $EXISTING_APP "owner()(address)" --rpc-url $RPC
# Expected: deployer address

# Enable TCB check
cast send $EXISTING_APP "setRequireTcbUpToDate(bool)" true \
  --private-key $PK --rpc-url $RPC
cast call $EXISTING_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: true

# Disable TCB check
cast send $EXISTING_APP "setRequireTcbUpToDate(bool)" false \
  --private-key $PK --rpc-url $RPC
cast call $EXISTING_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: false

Phase 5: Verify new factory path (6-param with TCB flag)

# With TCB flag
npx hardhat kms:create-app \
  --require-tcb-up-to-date \
  --allow-any-device \
  --hash 0x0000000000000000000000000000000000000000000000000000000000000002 \
  --network test
export NEW_APP=<output address>

cast call $NEW_APP "version()(uint256)" --rpc-url $RPC
# Expected: 2
cast call $NEW_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: true

# Without TCB flag
npx hardhat kms:create-app \
  --allow-any-device \
  --hash 0x0000000000000000000000000000000000000000000000000000000000000003 \
  --network test
export NEW_APP2=<output address>

cast call $NEW_APP2 "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: false

Phase 6: Verify backward-compatible 5-param factory path (old SDK callers)

This verifies that old SDK callers using the 5-param deployAndRegisterApp(address,bool,bool,bytes32,bytes32) still work against the upgraded KMS contract.

# Send tx via old 5-param overload, capture txhash with --json
TX_HASH=$(cast send $KMS_PROXY \
  "deployAndRegisterApp(address,bool,bool,bytes32,bytes32)" \
  $(cast wallet address --private-key $PK) \
  false \
  true \
  0x0000000000000000000000000000000000000000000000000000000000000000 \
  0x0000000000000000000000000000000000000000000000000000000000000005 \
  --private-key $PK --rpc-url $RPC --json | jq -r '.transactionHash')

# Extract app address from AppDeployedViaFactory event (topics[1])
# Event signature: AppDeployedViaFactory(address indexed appId, address indexed deployer)
# Topic[0] = 0xfd86d7f6962eba3b7a3bf9129c06c0b2f885e1c61ef2c9f0dbb856be0deefdee
export OLD_SDK_APP=$(cast receipt $TX_HASH --json --rpc-url $RPC \
  | jq -r '.logs[] | select(.topics[0] == "0xfd86d7f6962eba3b7a3bf9129c06c0b2f885e1c61ef2c9f0dbb856be0deefdee") | .topics[1]' \
  | sed 's/0x000000000000000000000000/0x/')

echo "Old SDK app deployed at: $OLD_SDK_APP"

cast call $OLD_SDK_APP "version()(uint256)" --rpc-url $RPC
# Expected: 2

cast call $OLD_SDK_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: false (5-param overload defaults to false)

cast call $OLD_SDK_APP "allowAnyDevice()(bool)" --rpc-url $RPC
# Expected: true

Phase 7: Verify app:deploy standalone path

npx hardhat app:deploy \
  --require-tcb-up-to-date \
  --allow-any-device \
  --hash 0x0000000000000000000000000000000000000000000000000000000000000004 \
  --network test
export STANDALONE_APP=<output address>

cast call $STANDALONE_APP "version()(uint256)" --rpc-url $RPC
# Expected: 2
cast call $STANDALONE_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: true

Checklist

| #  | Check                                                   | Expected                          |
|----|---------------------------------------------------------|-----------------------------------|
| 1  | Old app version() before upgrade                        | revert                            |
| 2  | Old app version() after upgrade                         | 2                                 |
| 3  | Old app requireTcbUpToDate() after upgrade              | false                             |
| 4  | Old data preserved (composeHash, allowAnyDevice, owner) | unchanged                         |
| 5  | setRequireTcbUpToDate(true)                             | success, reads back true          |
| 6  | setRequireTcbUpToDate(false)                            | success, reads back false         |
| 7  | Factory 6-param + requireTcbUpToDate=true               | new app flag is true              |
| 8  | Factory 6-param + requireTcbUpToDate=false              | new app flag is false             |
| 9  | Factory 5-param (old SDK compat)                        | new app, requireTcbUpToDate=false |
| 10 | app:deploy --require-tcb-up-to-date                     | standalone deploy flag is true    |
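
The checklist can also be tallied by a script rather than by eye. A rough sketch, assuming a POSIX shell and that `cast call` exits non-zero when the call reverts (worth confirming on your cast version); `check` and `expect_revert` are hypothetical helpers, not part of the repo:

```shell
PASS=0
FAIL=0

# check <description> <expected> <actual>: string-compare one result.
check() {
  if [ "$2" = "$3" ]; then
    PASS=$((PASS + 1)); echo "ok   - $1"
  else
    FAIL=$((FAIL + 1)); echo "FAIL - $1 (expected '$2', got '$3')"
  fi
}

# expect_revert <description> <command...>: passes only if the command fails.
expect_revert() {
  desc=$1
  shift
  if "$@" >/dev/null 2>&1; then
    FAIL=$((FAIL + 1)); echo "FAIL - $desc (call succeeded)"
  else
    PASS=$((PASS + 1)); echo "ok   - $desc"
  fi
}

# Example wiring for rows 1 to 3 (assumes the variables from the plan are set):
#   expect_revert "version() before upgrade" \
#     cast call "$EXISTING_APP" "version()(uint256)" --rpc-url "$RPC"
#   check "version() after upgrade" 2 \
#     "$(cast call "$EXISTING_APP" "version()(uint256)" --rpc-url "$RPC")"
#   check "requireTcbUpToDate() after upgrade" false \
#     "$(cast call "$EXISTING_APP" "requireTcbUpToDate()(bool)" --rpc-url "$RPC")"
#   echo "passed=$PASS failed=$FAIL"
```

Note that `expect_revert` treats any failure as a pass, including an unreachable RPC endpoint, so row 1 is only meaningful once the other calls against the same endpoint succeed.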

Notes

  • Always npx hardhat clean && npx hardhat compile after every git checkout to avoid stale compilation artifacts
  • Factory-deployed proxies (via kms:create-app) are NOT in the hardhat-upgrades .openzeppelin manifest. Use cast send upgradeToAndCall directly instead of npx hardhat app:upgrade for these proxies
  • app:upgrade (via hardhat-upgrades plugin) works correctly for proxies deployed via app:deploy (which uses hre.upgrades.deployProxy)
  • The 5-param deployAndRegisterApp overload is kept for backward compatibility with existing SDK callers (e.g. phala-cloud SDKs using viem). It defaults requireTcbUpToDate=false

…rApp

Keep the old 5-param deployAndRegisterApp(address,bool,bool,bytes32,bytes32)
overload that defaults requireTcbUpToDate=false, so existing SDK callers
(e.g. phala-cloud SDKs using viem) continue to work without changes.

Fix ethers v6 ambiguous overload resolution by using explicit function
signatures in kms:create-app task and upgrade tests.

Leechael commented Feb 11, 2026

Production Upgrade Safety Guide (Multisig)

Owner operations go through the multisig. Implementation deployments can be done by any EOA independently.

Pre-upgrade: Record all current implementation addresses

export IMPL_SLOT=0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc

# Record KMS current implementation
OLD_KMS_IMPL=$(cast storage $KMS_PROXY $IMPL_SLOT --rpc-url $RPC)
echo "KMS impl: $OLD_KMS_IMPL"

# Record each App proxy current implementation
OLD_APP_IMPL=$(cast storage $APP_PROXY $IMPL_SLOT --rpc-url $RPC)
echo "App impl: $OLD_APP_IMPL"

# Record current factory app implementation pointer
OLD_FACTORY_APP_IMPL=$(cast call $KMS_PROXY "appImplementation()(address)" --rpc-url $RPC)
echo "Factory app impl: $OLD_FACTORY_APP_IMPL"

Save these values offline before proceeding. They are your rollback targets.
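
One way to make the "save offline" step harder to skip is to write the three values to a file and refuse to continue if any is empty. A minimal sketch, assuming a POSIX shell; `save_rollback_targets` and the file name pattern are hypothetical, not part of the repo:

```shell
# Hypothetical helper: write the rollback targets to a file and print its path.
# Usage: save_rollback_targets [output-file]
save_rollback_targets() (
  out=${1:-rollback-targets-$(date -u +%Y%m%dT%H%M%SZ).env}
  # Refuse to write if any target is missing, so a blank never gets saved.
  for name in OLD_KMS_IMPL OLD_APP_IMPL OLD_FACTORY_APP_IMPL; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || { echo "missing $name" >&2; exit 1; }
  done
  umask 077   # file readable by the current user only
  {
    printf 'OLD_KMS_IMPL=%s\n' "$OLD_KMS_IMPL"
    printf 'OLD_APP_IMPL=%s\n' "$OLD_APP_IMPL"
    printf 'OLD_FACTORY_APP_IMPL=%s\n' "$OLD_FACTORY_APP_IMPL"
  } > "$out"
  echo "$out"
)

# Usage, after running the three recording commands above:
#   save_rollback_targets            # timestamped file in the current directory
#   save_rollback_targets /path/to/offline/targets.env
```

This sketch records a single OLD_APP_IMPL; with several app proxies, each proxy's implementation would need its own entry.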


Phase A: Deploy new implementations (any EOA, no multisig needed)

Implementation contracts are just bytecode on chain — deploying them has zero effect on existing proxies. Any team member with an EOA and gas can do this independently, ahead of the multisig signing session.

npx hardhat app:deploy-impl --network <network>
# Record: NEW_APP_IMPL=<address>

npx hardhat kms:deploy-impl --network <network>
# Record: NEW_KMS_IMPL=<address>

Verify the deployed code is correct:

# Should return 2
cast call $NEW_APP_IMPL "version()(uint256)" --rpc-url $RPC

# Confirm implementation addresses differ from current
echo "Old app impl: $OLD_APP_IMPL"
echo "New app impl: $NEW_APP_IMPL"

These addresses can sit on chain indefinitely until the multisig is ready to proceed.

Phase B: Full dry run on anvil fork (any team member)

Run the complete Manual Upgrade Test Plan (see comment above) on a fork. This also requires no multisig.


Phase C: Multisig operations (requires signature collection)

Pre-encode ALL calldata (including rollback) before starting the signing session.

Tx #1: Upgrade KMS proxy

cast calldata "upgradeToAndCall(address,bytes)" $NEW_KMS_IMPL 0x

Submit to Safe UI:

  • To: $KMS_PROXY
  • Value: 0
  • Data: paste the output above
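
Signers may want to confirm the Data field without trusting whoever ran `cast calldata`. For an empty `bytes` argument the encoding is short enough to rebuild by hand: the 4-byte selector, the address left-padded to 32 bytes, the offset word 0x40, and a zero length word. A minimal sketch under those assumptions; `expected_upgrade_calldata` is a hypothetical helper, and the selector 4f1ef286 is hard-coded here (it can be cross-checked with `cast sig "upgradeToAndCall(address,bytes)"`):

```shell
# Hypothetical helper: rebuild upgradeToAndCall(address,bytes) calldata
# for an empty bytes argument, without calling cast.
expected_upgrade_calldata() (
  addr=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  addr=${addr#0x}
  [ "${#addr}" -eq 40 ] || { echo "expected a 20-byte address" >&2; exit 1; }
  selector=4f1ef286                 # assumed selector, see lead-in
  zeros24=000000000000000000000000  # left padding of the address word
  offset=$(printf '%064x' 64)       # bytes argument starts 0x40 into the args
  length=$(printf '%064x' 0)        # empty bytes: length 0, no data words
  printf '0x%s%s%s%s%s\n' "$selector" "$zeros24" "$addr" "$offset" "$length"
)

# Usage (assumes NEW_KMS_IMPL is set as in Phase A):
#   a=$(expected_upgrade_calldata "$NEW_KMS_IMPL")
#   b=$(cast calldata "upgradeToAndCall(address,bytes)" "$NEW_KMS_IMPL" 0x | tr 'A-F' 'a-f')
#   [ "$a" = "$b" ] && echo "calldata OK" || echo "calldata MISMATCH"
```

The same check applies to Tx #3 and to the rollback calldata, since they use the same function with a different implementation address.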

After execution, verify:

cast storage $KMS_PROXY $IMPL_SLOT --rpc-url $RPC
# Must match NEW_KMS_IMPL

cast call $KMS_PROXY "appImplementation()(address)" --rpc-url $RPC
# Must still return OLD_FACTORY_APP_IMPL (unchanged)

Tx #2: Update factory app implementation pointer

cast calldata "setAppImplementation(address)" $NEW_APP_IMPL

Submit to Safe UI:

  • To: $KMS_PROXY
  • Value: 0
  • Data: paste the output above

This only affects NEW apps created via factory. Existing apps are untouched.

Tx #3: Canary — upgrade ONE non-critical app proxy

cast calldata "upgradeToAndCall(address,bytes)" $NEW_APP_IMPL 0x

Submit to Safe UI:

  • To: $CANARY_APP (the app proxy address)
  • Value: 0
  • Data: paste the output above

Verify:

cast call $CANARY_APP "version()(uint256)" --rpc-url $RPC
# Expected: 2

cast call $CANARY_APP "requireTcbUpToDate()(bool)" --rpc-url $RPC
# Expected: false

cast call $CANARY_APP "owner()(address)" --rpc-url $RPC
# Expected: multisig address (unchanged)

Tx #4+: Upgrade remaining app proxies

Repeat tx #3 for each remaining app proxy. These transactions can be batched into a single multisig approval using Safe Transaction Builder.


Rollback commands (multisig transactions)

Important: Rollback through multisig requires collecting signatures, which takes time. This is why the incremental sequence (tx #1, then #2, then #3, then #4) is critical: verify each step before proceeding.

Pre-encode these before starting the upgrade so they're ready if needed:

Rollback KMS proxy:

cast calldata "upgradeToAndCall(address,bytes)" $OLD_KMS_IMPL 0x
# Submit to Safe: To=$KMS_PROXY, Value=0, Data=<output>

Rollback App proxy:

cast calldata "upgradeToAndCall(address,bytes)" $OLD_APP_IMPL 0x
# Submit to Safe: To=$APP_PROXY, Value=0, Data=<output>

Rollback factory pointer:

cast calldata "setAppImplementation(address)" $OLD_FACTORY_APP_IMPL
# Submit to Safe: To=$KMS_PROXY, Value=0, Data=<output>

What cannot be rolled back

If a DstackApp owner has previously called disableUpgrades(), that app can never be upgraded. This is by design. Nothing in this upgrade calls disableUpgrades(), so no app loses upgradeability as a result of it.

Post-upgrade behavior guarantees

  • requireTcbUpToDate defaults to false for all upgraded apps — no silent behavior change
  • Existing allowedComposeHashes, allowedDeviceIds, allowAnyDevice, owner — all preserved
  • Old 5-param deployAndRegisterApp ABI continues to work (backward compatible)
  • App owners must explicitly call setRequireTcbUpToDate(true) to opt in to TCB enforcement

Checklist before signing session

  • New implementation addresses recorded (NEW_APP_IMPL, NEW_KMS_IMPL)
  • Old implementation addresses recorded (OLD_APP_IMPL, OLD_KMS_IMPL, OLD_FACTORY_APP_IMPL)
  • Anvil fork dry run passed all checks
  • All calldata pre-encoded (upgrade + rollback)
  • cast call $NEW_APP_IMPL "version()(uint256)" returns 2
