test: mobile benchmarking for proving and witness generation using BrowserStack and mobench #398

Open
dcbuild3r wants to merge 26 commits into main from dcbuild3r/mobench-zk-benchmarks

@dcbuild3r (Collaborator) commented Feb 11, 2026

The goal of this PR is to test mobench, a Rust library for benchmarking Rust functions on mobile devices via BrowserStack. BrowserStack's automated mobile testing API, App Automate, lets you send .ipa or .apk files to live devices, which install the apps and run whatever automation they contain. In our case that is XCUITest on iOS and Espresso on Android. The mobench crate uses UniFFI, a Rust bindings generator, to create Swift and Kotlin bindings for the Rust functions and the benchmarking harness around them. The generated C ABI bindings are placed in Kotlin/Swift app templates that render a UI showing the benchmarks. A recording of each run is also available on BrowserStack. The main output is a results.json containing all benchmark data. Besides timing, BrowserStack also records resource usage metrics such as RAM and CPU.

Summary

  • add mobile benchmark CI workflows for iOS + Android (mobile-bench-ios.yml) with configurable inputs (platforms, proof_scope, modes, iterations, warmup, device_profile, custom overrides, mobench_ref)
  • add PR command dispatcher (mobile-bench-pr-command.yml) to run benchmarks via /mobench ... comments
  • keep legacy label workflow present but disabled (mobile-bench-pr-label.yml) to avoid per-commit benchmark auto-runs
  • add natural-language CI reporting (bench-mobile/scripts/summarize_mobench_ci.py) with at-a-glance scorecard in workflow summary and sticky PR comment support
  • add detailed CI docs (bench-mobile/docs/ci-pipeline-detailed.md) and refresh benchmark docs
  • colocate benchmark config/device matrix templates under bench-mobile/:
    • bench-mobile/bench-config.toml
    • bench-mobile/bench-config.ios.toml
    • bench-mobile/bench-config.android.toml
    • bench-mobile/device-matrix.yaml
    • bench-mobile/device-matrix.ios.low-spec.yaml
    • bench-mobile/device-matrix.android.low-spec.yaml

Notes

  • /mobench comment dispatch is collaborator-restricted (OWNER|MEMBER|COLLABORATOR) and blocks fork PR dispatch
  • benchmark workflow defaults to worldcoin/mobile-bench-rs@codex/ci-devex and allows override via mobench_ref
  • runtime iOS/Android benchmark configs are generated in bench-mobile/ and uploaded in artifacts
  • GitHub evaluates issue_comment workflows from default branch; command dispatch requires mobile-bench-pr-command.yml to exist on main
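As a rough illustration of the comment-dispatch notes above, a `/mobench key=value ...` comment could be gated and parsed along these lines. This is a sketch under stated assumptions, not the code in mobile-bench-pr-command.yml: the function names `is_allowed_association` and `parse_mobench_command` are hypothetical, the key allow-list is taken from the inputs named in this PR, and the value character set is an assumed conservative choice. The fork-PR check is not modeled here.

```shell
#!/bin/sh
# Hypothetical sketch; the real dispatcher may work differently.

# Succeeds only for the author associations listed in the Notes.
is_allowed_association() {
  case "$1" in
    OWNER|MEMBER|COLLABORATOR) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints one "key=value" line per argument of a /mobench comment.
# Fails on a missing "/mobench" prefix, an unknown key, or a value
# containing characters outside a conservative allow-list.
parse_mobench_command() {
  comment="$1"
  case "$comment" in
    "/mobench"|"/mobench "*) ;;
    *) echo "not a /mobench command" >&2; return 1 ;;
  esac
  set -f                        # no globbing while splitting into words
  set -- ${comment#/mobench}    # intentionally unquoted: split on whitespace
  set +f
  for arg in "$@"; do
    case "$arg" in
      *=*) ;;
      *) echo "expected key=value, got '$arg'" >&2; return 1 ;;
    esac
    key="${arg%%=*}"
    value="${arg#*=}"
    case "$key" in
      platforms|proof_scope|modes|iterations|warmup|device_profile|mobench_ref|fetch_timeout_secs) ;;
      *) echo "unknown key '$key'" >&2; return 1 ;;
    esac
    case "$value" in
      ''|*[!A-Za-z0-9_./,-]*) echo "unsafe value for '$key'" >&2; return 1 ;;
    esac
    printf '%s=%s\n' "$key" "$value"
  done
}
```

A caller would check `is_allowed_association` first and use the parsed output only when `parse_mobench_command` exits successfully, since lines for earlier arguments may already have been printed when a later argument is rejected.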

Validation

  • latest PR mobile benchmark run passed for both Android and iOS
  • benchmark workflow executes full BrowserStack matrix and uploads results artifacts
  • branch changes are scoped to benchmark workflows and bench-mobile/ benchmarking assets/docs/code

Comment on lines 651 to 718
run: |
  set -euo pipefail

  scope="${PROOF_SCOPE,,}"
  modes="${BENCH_MODES,,}"
  modes="${modes//[[:space:]]/}"
  if [[ -z "$modes" ]]; then
    modes="all"
  fi

  case "$scope" in
    both|pi1|pi2) ;;
    *)
      echo "::error::Invalid proof_scope: '$scope' (expected: both|pi1|pi2)"
      exit 1
      ;;
  esac

  mode_enabled() {
    local mode="$1"
    [[ "$modes" == "all" ]] && return 0
    [[ ",$modes," == *",$mode,"* ]]
  }

  scope_enabled() {
    local bench_scope="$1"
    [[ "$scope" == "both" || "$scope" == "$bench_scope" ]]
  }

  mkdir -p target/mobench/ci/android

  benches=(
    "pi2 witness bench_mobile::bench_nullifier_witness_generation_only nullifier-witness"
    "pi2 proving bench_mobile::bench_nullifier_proving_only nullifier-proving"
    "pi2 full bench_mobile::bench_nullifier_proof_generation nullifier-full"
    "pi1 witness bench_mobile::bench_query_witness_generation_only query-witness"
    "pi1 proving bench_mobile::bench_query_proving_only query-proving"
    "pi1 full bench_mobile::bench_query_proof_generation query-full"
  )

  selected=0
  for bench in "${benches[@]}"; do
    read -r bench_scope bench_mode function output <<<"$bench"
    if ! scope_enabled "$bench_scope"; then
      continue
    fi
    if ! mode_enabled "$bench_mode"; then
      continue
    fi
    selected=$((selected + 1))
    cargo mobench run \
      --target android \
      --function "${function}" \
      --iterations "${{ inputs.iterations }}" \
      --warmup "${{ inputs.warmup }}" \
      --config bench-config.android.runtime.toml \
      --release \
      --fetch \
      --fetch-timeout-secs "${{ inputs.fetch_timeout_secs }}" \
      --summary-csv \
      --output "target/mobench/ci/android/${output}.json"
  done

  if [[ "$selected" -eq 0 ]]; then
    echo "::error::No Android benchmarks selected by proof_scope='${scope}' and modes='${modes}'."
    exit 1
  fi

@semgrep-code-worldcoin semgrep-code-worldcoin bot Feb 11, 2026

Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to double-quote the environment variable, like this: "$ENVVAR".

🌟 Fixed in commit cd8f408 🌟
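
The remediation described in this finding could look roughly like the sketch below. It is an illustration only, not the change made in cd8f408. It assumes the step's `env:` mapping exports the inputs under the hypothetical names `ITERATIONS`, `WARMUP`, and `FETCH_TIMEOUT_SECS` (for example `ITERATIONS: ${{ inputs.iterations }}`), so the script body itself contains no `${{ ... }}` expressions. The helpers `require_uint` and `build_numeric_flags` are also hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes the workflow step sets ITERATIONS, WARMUP and
# FETCH_TIMEOUT_SECS through its env: mapping (illustrative names).

# Fails unless the value is a non-empty string of decimal digits.
require_uint() {
  name="$1"
  value="$2"
  case "$value" in
    ''|*[!0-9]*)
      echo "::error::${name} must be a non-negative integer, got '${value}'" >&2
      return 1
      ;;
  esac
}

# Validates the numeric inputs, then prints the matching benchmark flags
# and their values, one per line.
build_numeric_flags() {
  require_uint iterations "${ITERATIONS:-}" || return 1
  require_uint warmup "${WARMUP:-}" || return 1
  require_uint fetch_timeout_secs "${FETCH_TIMEOUT_SECS:-}" || return 1
  printf '%s\n' \
    "--iterations" "$ITERATIONS" \
    "--warmup" "$WARMUP" \
    "--fetch-timeout-secs" "$FETCH_TIMEOUT_SECS"
}
```

Because the values reach the script as environment variables and are always expanded inside double quotes, a crafted input is treated as data and is never re-parsed as shell syntax. The digit check is an extra guard that rejects malformed input before the benchmark command runs.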

Comment on lines 624 to 645
run: |
  cat > device-matrix.android.runtime.yaml <<EOF
  devices:
    - name: "${{ steps.resolve_android.outputs.device_name }}"
      os: "android"
      os_version: "${{ steps.resolve_android.outputs.os_version }}"
      tags: ["runtime", "android"]
  EOF
  cat > bench-config.android.runtime.toml <<EOF
  target = "android"
  function = "bench_mobile::bench_nullifier_proving_only"
  iterations = ${{ inputs.iterations }}
  warmup = ${{ inputs.warmup }}
  device_matrix = "device-matrix.android.runtime.yaml"
  device_tags = ["runtime"]

  [browserstack]
  app_automate_username = "\${BROWSERSTACK_USERNAME}"
  app_automate_access_key = "\${BROWSERSTACK_ACCESS_KEY}"
  project = "mobile-bench-rs"
  EOF

@semgrep-code-worldcoin semgrep-code-worldcoin bot Feb 11, 2026

Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to double-quote the environment variable, like this: "$ENVVAR".

🚀 Fixed in commit cd8f408 🚀
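
For the generated config files, the same idea could be sketched as follows. This is an assumption-laden illustration rather than the actual fix: `DEVICE_NAME` and `OS_VERSION` are hypothetical environment variable names assumed to be populated from the resolve step's outputs through `env:`, and `safe_scalar` and `write_device_matrix` are hypothetical helpers with an assumed conservative character allow-list.

```shell
#!/bin/sh
# Sketch only. DEVICE_NAME and OS_VERSION are illustrative names assumed
# to be set from the resolve step's outputs via the step's env: mapping.

# Fails if the value is empty or contains anything other than letters,
# digits, spaces, dots, underscores, or hyphens.
safe_scalar() {
  case "$1" in
    ''|*[!A-Za-z0-9\ ._-]*) return 1 ;;
  esac
}

# Writes the runtime device matrix to the given path after validating
# both values, so a quote or newline cannot alter the YAML structure.
write_device_matrix() {
  out_file="$1"
  safe_scalar "${DEVICE_NAME:-}" || { echo "::error::unsafe device name" >&2; return 1; }
  safe_scalar "${OS_VERSION:-}" || { echo "::error::unsafe OS version" >&2; return 1; }
  cat > "$out_file" <<EOF
devices:
  - name: "${DEVICE_NAME}"
    os: "android"
    os_version: "${OS_VERSION}"
    tags: ["runtime", "android"]
EOF
}
```

Expanding `$DEVICE_NAME` inside an unquoted heredoc does not re-evaluate its contents as shell code, so the allow-list here mainly protects the well-formedness of the generated YAML rather than the runner itself.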

Comment on lines 366 to 433
run: |
  set -euo pipefail

  scope="${PROOF_SCOPE,,}"
  modes="${BENCH_MODES,,}"
  modes="${modes//[[:space:]]/}"
  if [[ -z "$modes" ]]; then
    modes="all"
  fi

  case "$scope" in
    both|pi1|pi2) ;;
    *)
      echo "::error::Invalid proof_scope: '$scope' (expected: both|pi1|pi2)"
      exit 1
      ;;
  esac

  mode_enabled() {
    local mode="$1"
    [[ "$modes" == "all" ]] && return 0
    [[ ",$modes," == *",$mode,"* ]]
  }

  scope_enabled() {
    local bench_scope="$1"
    [[ "$scope" == "both" || "$scope" == "$bench_scope" ]]
  }

  mkdir -p target/mobench/ci/ios

  benches=(
    "pi2 witness bench_mobile::bench_nullifier_witness_generation_only nullifier-witness"
    "pi2 proving bench_mobile::bench_nullifier_proving_only nullifier-proving"
    "pi2 full bench_mobile::bench_nullifier_proof_generation nullifier-full"
    "pi1 witness bench_mobile::bench_query_witness_generation_only query-witness"
    "pi1 proving bench_mobile::bench_query_proving_only query-proving"
    "pi1 full bench_mobile::bench_query_proof_generation query-full"
  )

  selected=0
  for bench in "${benches[@]}"; do
    read -r bench_scope bench_mode function output <<<"$bench"
    if ! scope_enabled "$bench_scope"; then
      continue
    fi
    if ! mode_enabled "$bench_mode"; then
      continue
    fi
    selected=$((selected + 1))
    cargo mobench run \
      --target ios \
      --function "${function}" \
      --iterations "${{ inputs.iterations }}" \
      --warmup "${{ inputs.warmup }}" \
      --config bench-config.ios.runtime.toml \
      --release \
      --fetch \
      --fetch-timeout-secs "${{ inputs.fetch_timeout_secs }}" \
      --summary-csv \
      --output "target/mobench/ci/ios/${output}.json"
  done

  if [[ "$selected" -eq 0 ]]; then
    echo "::error::No iOS benchmarks selected by proof_scope='${scope}' and modes='${modes}'."
    exit 1
  fi

@semgrep-code-worldcoin semgrep-code-worldcoin bot Feb 11, 2026

Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to double-quote the environment variable, like this: "$ENVVAR".

🚀 Fixed in commit cd8f408 🚀

Comment on lines 335 to 360
run: |
  cat > device-matrix.ios.runtime.yaml <<EOF
  devices:
    - name: "${{ steps.resolve_ios.outputs.device_name }}"
      os: "ios"
      os_version: "${{ steps.resolve_ios.outputs.os_version }}"
      tags: ["runtime", "ios"]
  EOF
  cat > bench-config.ios.runtime.toml <<EOF
  target = "ios"
  function = "bench_mobile::bench_nullifier_proving_only"
  iterations = ${{ inputs.iterations }}
  warmup = ${{ inputs.warmup }}
  device_matrix = "device-matrix.ios.runtime.yaml"
  device_tags = ["runtime"]

  [browserstack]
  app_automate_username = "\${BROWSERSTACK_USERNAME}"
  app_automate_access_key = "\${BROWSERSTACK_ACCESS_KEY}"
  project = "mobile-bench-rs"

  [ios_xcuitest]
  app = "target/mobench/ios/BenchRunner.ipa"
  test_suite = "target/mobench/ios/BenchRunnerUITests.zip"
  EOF

@semgrep-code-worldcoin semgrep-code-worldcoin bot Feb 11, 2026

Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to double-quote the environment variable, like this: "$ENVVAR".

🧼 Fixed in commit cd8f408 🧼

@dcbuild3r dcbuild3r changed the title from "ci: add dual-platform BrowserStack mobench benchmark automation" to "feature: mobile benchmarking for proving and witness generation using BrowserStack and mobench" Feb 11, 2026
@dcbuild3r dcbuild3r changed the title from "feature: mobile benchmarking for proving and witness generation using BrowserStack and mobench" to "feature test: mobile benchmarking for proving and witness generation using BrowserStack and mobench" Feb 11, 2026
@dcbuild3r dcbuild3r force-pushed the dcbuild3r/mobench-zk-benchmarks branch from ba76f1d to 16225b7 February 11, 2026 10:29
@dcbuild3r dcbuild3r changed the title from "feature test: mobile benchmarking for proving and witness generation using BrowserStack and mobench" to "ci(bench-mobile): harden mobile benchmark workflow and fix CI checks" Feb 11, 2026
@dcbuild3r dcbuild3r force-pushed the dcbuild3r/mobench-zk-benchmarks branch from cd8f408 to c9cfd8d February 11, 2026 17:41
@dcbuild3r dcbuild3r changed the title from "ci(bench-mobile): harden mobile benchmark workflow and fix CI checks" to "test: mobile benchmarking for proving and witness generation using BrowserStack and mobench" Feb 11, 2026
@dcbuild3r (Collaborator, Author) commented:

/mobench platforms=both iterations=30 warmup=5 proof_scope=both modes=all device_profile=auto-low-spec

@dcbuild3r dcbuild3r added the mobench label ("Trigger mobile benchmark workflow for PR") Feb 12, 2026
@dcbuild3r dcbuild3r force-pushed the dcbuild3r/mobench-zk-benchmarks branch from 2eb9e35 to 4a2a8e9 February 12, 2026 21:31
@dcbuild3r dcbuild3r added and removed the mobench label ("Trigger mobile benchmark workflow for PR") Feb 13, 2026
@dcbuild3r (Collaborator, Author) commented:

/mobench platforms=both proof_scope=both modes=all device_profile=auto-low-spec iterations=30 warmup=5 fetch_timeout_secs=1800

@dcbuild3r (Collaborator, Author) commented:

Temporarily closing to retrigger mobile-bench checks after CI fix; reopening immediately.

@dcbuild3r dcbuild3r closed this Feb 13, 2026
@dcbuild3r dcbuild3r reopened this Feb 13, 2026
@dcbuild3r dcbuild3r added and removed the mobench label ("Trigger mobile benchmark workflow for PR") Feb 16, 2026
Comment on lines +998 to +1015
run: |
  set -euo pipefail
  python3 world-id-protocol/bench-mobile/scripts/summarize_mobench_ci.py \
    --ios-dir artifacts/ios \
    --android-dir artifacts/android \
    --ios-result "${IOS_RESULT}" \
    --android-result "${ANDROID_RESULT}" \
    --platforms "${{ inputs.platforms }}" \
    --proof-scope "${{ inputs.proof_scope }}" \
    --modes "${{ inputs.modes }}" \
    --device-profile "${{ inputs.device_profile }}" \
    --mobench-ref "${{ inputs.mobench_ref }}" \
    --run-url "${RUN_URL}" \
    --pr-number "${{ inputs.pr_number }}" \
    --requested-by "${{ inputs.requested_by }}" \
    --request-command "${{ inputs.request_command }}" \
    --output /tmp/mobench-summary.md

@semgrep-code-worldcoin semgrep-code-worldcoin bot Feb 16, 2026

Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to double-quote the environment variable, like this: "$ENVVAR".

🌟 Removed in commit 27ca130 🌟

@dcbuild3r (Collaborator, Author) commented:

/mobench platforms=both iterations=30 warmup=5 proof_scope=both modes=all device_profile=auto-low-spec mobench_ref=codex/ci-devex

@dcbuild3r dcbuild3r marked this pull request as ready for review February 16, 2026 16:55
@dcbuild3r dcbuild3r requested a review from a team as a code owner February 16, 2026 16:55
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3db5993494

inventory = "0.3"

# World ID
world-id-core = { workspace = true, default-features = false, features = ["authenticator", "embed-zkeys", "issuer"] }

P1: Register bench-mobile in workspace members

bench-mobile inherits its core dependencies via workspace = true, but this commit never adds the crate to [workspace].members in the root Cargo.toml. For a package under the workspace root, Cargo rejects workspace inheritance unless it is a workspace member, so the new benchmark entry points (cargo run -p bench-mobile and CI mobench invocations) fail before any benchmark executes.

thiserror = { workspace = true }

# UniFFI for mobile bindings
uniffi = { workspace = true, features = ["cli"] }

P1: Define uniffi in workspace dependencies

This manifest declares uniffi = { workspace = true }, but the workspace root has no workspace.dependencies.uniffi entry, so Cargo cannot resolve this inherited dependency and fails to parse/build the new crate. As written, the mobile benchmark crate is not buildable until uniffi is added to workspace dependencies or pinned directly here.

Labels: mobench (Trigger mobile benchmark workflow for PR)

1 participant