@nammn nammn commented Jan 26, 2026

Summary

OpenShift E2E tests were forced to run sequentially (max_hosts: 1) because multiple operator instances on the same cluster would conflict on the ValidatingWebhookConfiguration, a cluster-scoped resource with a hardcoded name mdbpolicy.mongodb.com.


Changes to Enable Parallel Execution

1. Configurable Webhook Name

Why: The ValidatingWebhookConfiguration is cluster-scoped. Multiple operators would overwrite each other.

What: Made webhook name configurable via MDB_WEBHOOK_NAME env var and operator.webhook.name helm value. OpenShift context files set namespace-unique names (e.g., mdbpolicy.${NAMESPACE}.mongodb.com).
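A minimal sketch of how a namespace-unique name of the form mdbpolicy.${NAMESPACE}.mongodb.com could be built and sanity-checked. The helper name `webhook_name_for_namespace` is hypothetical and not taken from this PR, and the check assumes the usual Kubernetes DNS-subdomain naming rule (lowercase alphanumerics, `-` and `.`, at most 253 characters) applies to the resulting name.

```python
import re

# Assumed Kubernetes DNS-subdomain rule: dot-separated labels made of lowercase
# alphanumerics and '-', each starting and ending with an alphanumeric.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_DNS_SUBDOMAIN = re.compile(rf"{_LABEL}(\.{_LABEL})*")
_MAX_NAME_LENGTH = 253


def webhook_name_for_namespace(namespace: str) -> str:
    """Build a per-namespace webhook name (hypothetical helper)."""
    name = f"mdbpolicy.{namespace}.mongodb.com"
    if len(name) > _MAX_NAME_LENGTH or not _DNS_SUBDOMAIN.fullmatch(name):
        raise ValueError(f"invalid webhook configuration name: {name!r}")
    return name
```

Because test namespaces are unique per run, two operators installed in different namespaces would end up with different cluster-scoped names and no longer overwrite each other.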

2. Controlled Parallelism with Retries

Why: Unlimited parallelism caused pod scheduling failures due to cluster resource contention.

What: Set max_hosts: 2 per variant. Added a retry mechanism via <<: *teardown_group and removed OpenShift from the retry exclusion list.
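The idea of bounded parallelism with independent retries can be illustrated with a small sketch. This is only an analogy, not how the CI system implements max_hosts or task retries; the function names `run_with_retries` and `run_all` and their parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(task, max_attempts=2):
    """Call task() until it succeeds or max_attempts (>= 1) is exhausted."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task()
        except Exception as exc:  # a real runner would catch narrower errors
            last_error = exc
    raise last_error


def run_all(tasks, max_parallel=2, max_attempts=2):
    """Run tasks with at most max_parallel in flight, retrying each on its own."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(run_with_retries, t, max_attempts) for t in tasks]
        # Results come back in submission order; a task that still fails
        # after its last attempt re-raises here.
        return [f.result() for f in futures]
```

Capping the worker count plays the role of max_hosts: 2, while retrying each task separately means one flaky test does not force a rerun of the others.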


Proof of Work

https://spruce.mongodb.com/version/6977bceda7da920007894d34/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC (the OpenShift tasks in this run executed in parallel)

Checklist

  • Have you linked a Jira ticket and/or is the ticket in the title?
  • Have you checked whether your Jira ticket required DOCSP changes?
  • Have you added a changelog file?
    • Use the skip-changelog label if not needed

github-actions bot commented Jan 26, 2026

⚠️ (this preview might not be accurate if the PR is not rebased on current master branch)

MCK 1.7.0 Release Notes

New Features

  • Allows users to override any Ops Manager emptyDir mount with their own PVCs via the statefulSet.spec.volumeClaimTemplates override.
  • Added support for auto embeddings in MongoDB Community to automatically generate vector embeddings for the vector search data. This document can be followed for detailed documentation
  • MongoDBSearch: Updated the default mongodb/mongodb-search image version to 0.60.1. This is the version MCK uses if .spec.version is not specified.
  • Added support for configurable ValidatingWebhookConfiguration name via operator.webhook.name helm value.

Bug Fixes

  • Fixed an issue so that hosts are consistently removed from Ops Manager monitoring during AppDB scale-down events.
  • Fixed an issue where monitoring agents would fail after disabling TLS on a MongoDB deployment.
  • Persistent Volume Claim resize fix: Fixed an issue where the Operator ignored namespaces when listing PVCs, causing conflicts with resizing PVCs of the same name. Now, PVCs are filtered by both name and namespace for accurate resizing.
  • Fixed a panic that occurred when the domain names for a horizon were empty. Now, if the domain names are not valid (RFC 1123), validation fails before reconciling.
  • MongoDBMultiCluster, MongoDB: Fixed an issue where the operator skipped host removal when an external domain was used, leaving monitoring hosts in Ops Manager even after workloads were correctly removed from the cluster.
  • Fixed an issue where the Operator could crash when TLS certificates are configured using the certificatesSecretsPrefix field without additional TLS settings.
  • MongoDBOpsManager, AppDB: Block removing a member cluster while it still has non-zero members. This prevents scaling down without the preserved configuration and avoids unexpected issues.

@nammn nammn force-pushed the feature/configurable-webhook-name branch 7 times, most recently from 333a386 to 42f4a1c, January 26, 2026 15:34
# OM tests are also run on the same Openshift cluster, so we use 1 max host to not
# allow the helm installations to interfere with each other during the setup of the tests.
- max_hosts: 1
+ max_hosts: 2

nammn (Collaborator, Author): This enables concurrent re-runs as well as independent retries.

nammn (Collaborator, Author): But we can't set a higher number, as we are still limited by OpenShift.

@nammn nammn force-pushed the feature/configurable-webhook-name branch from 42f4a1c to b8b4905, January 26, 2026 16:02
- Add configurable webhook name via MDB_WEBHOOK_NAME env var and operator.webhook.name helm value
- Add support for MDB_OPERATOR_NAME env var to set operator.name helm value
- Set namespace-unique operator and webhook names in OpenShift context files
- Remove max_hosts constraint from OpenShift task group to allow parallel execution

The ValidatingWebhookConfiguration name was previously hardcoded to 'mdbpolicy.mongodb.com',
causing conflicts when multiple operator instances run on the same cluster. This change
allows each instance to use a unique webhook name while maintaining backward compatibility
(default remains 'mdbpolicy.mongodb.com' when not explicitly set).
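
The backward-compatible default described above can be sketched as follows. This is an illustration only and not the operator's actual code; the function name `resolve_webhook_name` is hypothetical, and treating an empty MDB_WEBHOOK_NAME the same as an unset one is an assumption.

```python
import os

# Default taken from the commit message above.
DEFAULT_WEBHOOK_NAME = "mdbpolicy.mongodb.com"


def resolve_webhook_name(env=None) -> str:
    """Return MDB_WEBHOOK_NAME if set and non-empty, else the default name."""
    env = os.environ if env is None else env
    # Assumption: an empty value falls back to the default, like an unset one.
    return env.get("MDB_WEBHOOK_NAME") or DEFAULT_WEBHOOK_NAME
```

Passing a plain dict instead of the process environment keeps the fallback easy to check in isolation, for example `resolve_webhook_name({})` yields the default name.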
@nammn nammn force-pushed the feature/configurable-webhook-name branch from b8b4905 to 1b2fc53, January 26, 2026 16:17
@nammn nammn marked this pull request as ready for review January 27, 2026 08:09
@nammn nammn requested review from a team and vinilage as code owners January 27, 2026 08:09
@nammn nammn enabled auto-merge (squash) January 28, 2026 15:13
@nammn nammn merged commit 4de136f into master Jan 28, 2026
5 of 6 checks passed
@nammn nammn deleted the feature/configurable-webhook-name branch January 28, 2026 15:21
nammn added a commit that referenced this pull request Jan 28, 2026
The sequential dependency was added as a workaround for webhook name
conflicts between parallel OpenShift tests. PR #721 solved this by
making webhook names namespace-unique, so the dependency is no longer
needed.