
DPE-4200 Scale-up from zero #414

Open · wants to merge 5 commits into main

Conversation

paulomach (Contributor)

Issue

On scaling to zero, the last unit ensures the cluster is dissolved.
Scaling back to units > 0 does not detect and handle the dissolved cluster.

Solution

  • detect and handle the special case of a dissolved cluster (see the sketch below)

Fixes #409
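
For orientation, the change amounts to two pieces, both quoted in the review comments below: detecting that the unit databag looks freshly reset, and having the leader recreate the cluster that was dissolved on scale-down. A rough sketch of that flow (handler and helper names here are illustrative, not the PR's literal code):

def _handle_scale_up_from_zero(self) -> None:
    # Illustrative only: when the unit databag contains nothing but Juju's
    # default keys (see the excerpt on lines +261 to +267 below), the leader
    # recreates the cluster dissolved on scale-down (lines +651 to +653).
    if not self._unit_data_is_default():
        return
    if self.unit.is_leader():
        self.create_cluster()

def _unit_data_is_default(self) -> bool:
    # Hypothetical name for the check quoted below.
    return self.unit_peer_data.keys() == {
        "egress-subnets",
        "ingress-address",
        "private-address",
        "unit-status",
    }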

@paulomach marked this pull request as ready for review June 14, 2024 20:03
@carlcsaposs-canonical (Contributor) left a comment

we might need to change some things on router to support scaling mysql-k8s to 0 and back up

Comment on lines +261 to +267
_default_unit_data_keys = {
    "egress-subnets",
    "ingress-address",
    "private-address",
    "unit-status",
}
return self.unit_peer_data.keys() == _default_unit_data_keys

nit: this test might be flaky if the default keys are ever changed by Juju; consider writing something to persistent disk or the databag and checking that instead.
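
A minimal sketch of that alternative, recording the dissolution explicitly instead of inferring it from Juju's default keys; "cluster-dissolved" is a hypothetical key, not one the charm defines today:

def _mark_cluster_dissolved(self) -> None:
    # Written by the last departing unit (the leader, per the
    # storage-detaching excerpt further down) when it dissolves the cluster.
    self.app_peer_data["cluster-dissolved"] = "true"

def _cluster_was_dissolved(self) -> bool:
    # Checked on scale-up: an explicit flag does not break if Juju ever
    # changes the set of default unit databag keys.
    return self.app_peer_data.get("cluster-dissolved") == "true"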


can we utilize units-added-to-cluster here instead of relying on these keys?
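
A sketch of what that could look like; the helper name is hypothetical, and whether the counter survives all paths is exactly the concern raised further down:

def _no_units_in_cluster(self) -> bool:
    # Hypothetical helper: a counter of 0 (or an absent key) would mean all
    # units were removed from the cluster, i.e. it was dissolved on
    # scale-down to zero and must be re-created rather than joined.
    return int(self.app_peer_data.get("units-added-to-cluster", 0)) == 0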

@taurus-forever: IMHO, we can rely only on the data on the storage (it can be foreign storage attached by user mistake, which we should not damage or erase).

Comment on lines +651 to +653
if self.unit.is_leader():
    # create the cluster due it being dissolved on scale-down
    self.create_cluster()

if there's a cluster that already exists on persistent disk, will that be re-used?

@taurus-forever: IMHO, the charm must be blocked if a foreign disk is attached by mistake (to avoid data damage).

Today this is not possible on K8s, but Pedro is working with Juju on it:

> juju scale-application mysql 0
mysql scaled to 0 units

> juju add-storage mysql/0 database=foreigndisk
ERROR Juju command "add-storage" not supported on container models    # will be supported on K8s like on VM

> juju scale-application mysql 3
...

Another reason to write to disk (see the sketch after this transcript):

> juju scale-application mysql 0

# juju controller crashed... restored ...

> juju scale-application mysql 1  # did we lose units-added-to-cluster?
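
A marker written to the MySQL storage itself would cover both of these points: it survives a controller restore (unlike relation data) and it also identifies a foreign disk. A minimal sketch; the marker path is hypothetical and depends on where the charm mounts its data:

from pathlib import Path

import ops

# Hypothetical marker file on the attached data storage.
_CLUSTER_MARKER = Path("/var/lib/mysql/.charm_cluster_marker")

def _write_cluster_marker(self) -> None:
    # Record which application initialized this disk, right after the
    # cluster is first created.
    _CLUSTER_MARKER.write_text(self.app.name)

def _storage_is_foreign(self) -> bool:
    # Marker present but written by another application => this disk was
    # attached by mistake and must not be touched.
    return _CLUSTER_MARKER.exists() and _CLUSTER_MARKER.read_text() != self.app.name

def _guard_against_foreign_storage(self) -> None:
    # Block instead of re-creating (and potentially damaging) the data.
    if self._storage_is_foreign():
        self.unit.status = ops.BlockedStatus(
            "storage contains data from another cluster; refusing to reuse it"
        )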

if self.unit.is_leader():
    # Update 'units-added-to-cluster' counter in the peer relation databag
    units = int(self.app_peer_data.get("units-added-to-cluster", 1))
    self.app_peer_data["units-added-to-cluster"] = str(units - 1)

should we also have this update to units-added-to-cluster on the peer-relation-departed event? Wouldn't we want to decrement this value when non-leader units are scaled down?

Or is this happening on update-status (in particular _set_app_status)? If so, could there be inconsistency issues? For example, with 3 units, if another unit departs and then the leader unit departs, units-added-to-cluster will be 2 instead of 1 (if storage-detaching runs on the leader before update-status).
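
For reference, a leader-only decrement on peer-relation-departed could look like the sketch below (clamped at zero); whether it double-counts with the storage-detaching hook above is exactly the open question here:

def _on_peer_relation_departed(self, event) -> None:
    # Sketch: keep the counter in sync whenever any unit leaves, instead of
    # relying on update-status to reconcile it later.
    if not self.unit.is_leader():
        return
    units = int(self.app_peer_data.get("units-added-to-cluster", 1))
    self.app_peer_data["units-added-to-cluster"] = str(max(units - 1, 0))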


@taurus-forever (Contributor) left a comment

LGTM, but IMHO it would be better to discuss the corner cases before merging.


@taurus-forever (Contributor)

BTW, we are adding a test to check scale-down to zero and restore with a foreign disk here.

Maybe it is worth copying it into MySQL VM/K8s as well?

@paulomach (Contributor, Author)

> we might need to change some things on router to support scaling mysql-k8s to 0 and back up

Testing it, it requires removing and re-relating. On scaling to zero, routers are already cleaned up from the metadata.
The messaging in the router is very clear; IMO that is adequate behavior.

@taurus-forever (Contributor)

re: it requires a remove and re-relate

The field/customers do not like this... can we avoid re-relation without really over-complicating the code?

@sombrafam

@paulomach hi Paulo, I see that the pull request was merged. How can I know the release version in which this will be published, if it has not been already?

@paulomach (Contributor, Author)

Hey @sombrafam, this is still under discussion. There are some edge cases we need to fix.


Successfully merging this pull request may close these issues: Mysql-k8s breaks after scale down 0 (#409)