rabbitmq_ct_helpers: Use node 2 as the cluster seed node #13099

Draft: wants to merge 3 commits into base: main

dumbbell (Member) commented Jan 20, 2025

Why

When running mixed-version tests, nodes 1/3/5/... are using the primary umbrella, so usually the newest version. Nodes 2/4/6/... are using the secondary umbrella, thus the old version.

When clustering, we used to use node 1 (running a new version) as the seed node, meaning other nodes would join it.

This complicates things with feature flags because we have to make sure that we start node 1 with new stable feature flags disabled to allow old nodes to join.

This is also a problem with Khepri machine versions because the cluster would start with the latest version, which old nodes might not have.
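To make the two problems above concrete, here is a minimal sketch of a simplified compatibility model. It assumes that a node can join only if it supports every feature flag already enabled in the cluster and the cluster's machine version. The class names, flag names, and version numbers are hypothetical and are not RabbitMQ APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCapabilities:
    """What a node is able to run (hypothetical model, not a RabbitMQ API)."""
    supported_flags: frozenset
    max_machine_version: int


@dataclass(frozen=True)
class ClusterState:
    """What the cluster currently requires from its members."""
    enabled_flags: frozenset
    machine_version: int


def seed_cluster(seed: NodeCapabilities) -> ClusterState:
    # Assumption: a fresh seed node enables everything it supports.
    return ClusterState(seed.supported_flags, seed.max_machine_version)


def can_join(joiner: NodeCapabilities, cluster: ClusterState) -> bool:
    # Simplified rule: the joiner must support all enabled flags and
    # must know the machine version the cluster already runs.
    return (cluster.enabled_flags <= joiner.supported_flags
            and cluster.machine_version <= joiner.max_machine_version)


# Hypothetical old and new nodes.
old_node = NodeCapabilities(frozenset({"flag_a"}), max_machine_version=1)
new_node = NodeCapabilities(frozenset({"flag_a", "flag_b"}), max_machine_version=2)

print(can_join(old_node, seed_cluster(new_node)))  # old node joining a new seed
print(can_join(new_node, seed_cluster(old_node)))  # new node joining an old seed
```

In this model, seeding from the new node blocks the old node unless the seed is started with the new flags disabled, whereas seeding from the old node needs no special handling.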

How

This patch changes the logic to use a node running the secondary umbrella as the seed node instead. If there is no node running it, we pick the first node as before.
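The selection rule described above can be sketched as follows. This is an illustration in Python, not the Erlang code from `rabbitmq_ct_helpers`; the `Node` type and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a test node."""
    index: int                     # 1-based position in the test cluster
    uses_secondary_umbrella: bool  # True if the node runs the old version


def make_nodes(count: int, mixed_versions: bool) -> List[Node]:
    # In mixed-version runs, even-numbered nodes use the secondary umbrella.
    return [Node(i, mixed_versions and i % 2 == 0) for i in range(1, count + 1)]


def pick_seed_node(nodes: Sequence[Node]) -> Node:
    if not nodes:
        raise ValueError("cannot pick a seed node from an empty list")
    # Prefer the first node running the secondary umbrella...
    for node in nodes:
        if node.uses_secondary_umbrella:
            return node
    # ...and fall back to the first node, as before the patch.
    return nodes[0]


print(pick_seed_node(make_nodes(3, mixed_versions=True)).index)
print(pick_seed_node(make_nodes(3, mixed_versions=False)).index)
```

With three nodes in a mixed-version run, node 2 becomes the seed. Without a secondary umbrella, or with a single node, the choice falls back to node 1.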

V2: Revert part of "rabbitmq_ct_helpers: Fix how we set `$RABBITMQ_FEATURE_FLAGS` in tests" (commit 57ed962 from #13077). These changes are no longer needed with the new logic.

V3: The check that verifies that the correct metadata store is used has a special case for nodes that use the secondary umbrella: if Khepri is supposed to be used but is not, the check enables the feature flag itself. The reason is that the `v4.0.x` branch doesn't know about the `rel` configuration of `forced_feature_flags_on_init`. Those nodes will have ignored this parameter and booted with the stable feature flags only.

Many testsuites are adapted to the new clustering order. If they manage which node joins which node, either the order is changed in the testcases, or nodes are started with only required feature flags. For testsuites that rely on peer discovery where the order is unknown, nodes are started with only required feature flags.
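The V3 special case in the metadata store check can be sketched roughly as below. This is an illustration under assumptions, not the helper's actual code: the function name, the string values, and the `enable_khepri` callback are all hypothetical.

```python
from typing import Callable


def check_metadata_store(expected: str,
                         actual: str,
                         uses_secondary_umbrella: bool,
                         enable_khepri: Callable[[], None]) -> str:
    """Return the metadata store in use after the check (hypothetical helper)."""
    if actual == expected:
        return actual
    # Special case: an old node ignored the forced feature flags setting,
    # so Khepri is enabled explicitly instead of failing the check.
    if expected == "khepri" and uses_secondary_umbrella:
        enable_khepri()
        return "khepri"
    raise RuntimeError(f"expected metadata store {expected!r}, got {actual!r}")


calls = []
result = check_metadata_store("khepri", "mnesia", True, lambda: calls.append("enable"))
print(result, calls)
```

A node on the primary umbrella with the wrong store still fails the check in this sketch; only secondary-umbrella nodes get the corrective step.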

dumbbell self-assigned this on Jan 20, 2025
dumbbell force-pushed the use-node2-as-cluster-seed-node-in-ci branch from 412addd to 7d9ff2d on January 20, 2025 11:16
dumbbell changed the title from "rabbitmq_ct_broker_helpers: Use node 2 as the cluster seed node" to "rabbitmq_ct_helpers: Use node 2 as the cluster seed node" on Jan 20, 2025
dumbbell force-pushed the use-node2-as-cluster-seed-node-in-ci branch 3 times, most recently from a0c969c to de11406 on January 24, 2025 10:13
dumbbell and others added 2 commits January 24, 2025 16:36
[Why]
When running mixed-version tests, nodes 1/3/5/... are using the primary
umbrella, so usually the newest version. Nodes 2/4/6/... are using the
secondary umbrella, thus the old version.

When clustering, we used to use node 1 (running a new version) as the
seed node, meaning other nodes would join it.

This complicates things with feature flags because we have to make sure
that we start node 1 with new stable feature flags disabled to allow old
nodes to join.

This is also a problem with Khepri machine versions because the cluster
would start with the latest version, which old nodes might not have.

[How]
This patch changes the logic to use a node running the secondary
umbrella as the seed node instead. If there is no node running it, we
pick the first node as before.

V2: Revert part of "rabbitmq_ct_helpers: Fix how we set
    `$RABBITMQ_FEATURE_FLAGS` in tests" (commit
    57ed962). These changes are no
    longer needed with the new logic.

V3: The check that verifies that the correct metadata store is used has
    a special case for nodes that use the secondary umbrella: if Khepri
    is supposed to be used but it's not, the feature flag is enabled.
    The reason is that the `v4.0.x` branch doesn't know about the `rel`
    configuration of `forced_feature_flags_on_init`. The nodes will
    have ignored this parameter and booted with the stable feature
    flags only.

    Many testsuites are adapted to the new clustering order. If they
    manage which node joins which node, either the order is changed in
    the testcases, or nodes are started with only required feature
    flags. For testsuites that rely on peer discovery where the order is
    unknown, nodes are started with only required feature flags.
dumbbell force-pushed the use-node2-as-cluster-seed-node-in-ci branch from de11406 to 2ce283d on January 24, 2025 15:36
2 participants