We have not yet found a way to reproduce this. Please see the attached bundle: bundle.yaml.txt
Expected behavior
MySQL is able to get the cluster endpoints
Actual behavior
MySQL fails to get the endpoints from the cluster status
unit-kfp-db-0: 01:36:15 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster status for kfp-db-cluster
unit-kfp-db-0: 01:36:15 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster endpoints
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/src/mysql_k8s_helpers.py", line 786, in update_endpoints
    rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
    return callable(*args, **kwargs) # type: ignore
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/lib/charms/mysql/v0/mysql.py", line 1872, in get_cluster_endpoints
    raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
unit-kfp-db-0: 01:36:16 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
unit-kfp-db-0: 01:36:17 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster status for kfp-db-cluster
unit-kfp-db-0: 01:36:17 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster endpoints
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/src/mysql_k8s_helpers.py", line 786, in update_endpoints
    rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
    return callable(*args, **kwargs) # type: ignore
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/lib/charms/mysql/v0/mysql.py", line 1872, in get_cluster_endpoints
    raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
unit-kfp-db-0: 01:36:18 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
unit-kfp-db-0: 01:36:19 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster status for kfp-db-cluster
unit-kfp-db-0: 01:36:19 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster endpoints
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/src/mysql_k8s_helpers.py", line 786, in update_endpoints
    rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
    return callable(*args, **kwargs) # type: ignore
  File "/var/lib/juju/agents/unit-kfp-db-0/charm/lib/charms/mysql/v0/mysql.py", line 1872, in get_cluster_endpoints
    raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
unit-kfp-db-0: 01:36:19 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
unit-kfp-db-0: 01:40:17 INFO unit.kfp-db/0.juju-log Unit workload member-state is online with member-role secondary
unit-kfp-db-0: 01:40:35 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-kfp-db-0: 01:40:37 INFO unit.kfp-db/0.juju-log database-peers:6: Starting the log rotate manager
unit-kfp-db-0: 01:40:37 INFO unit.kfp-db/0.juju-log database-peers:6: Started log rotate manager process with PID 1129
unit-kfp-db-0: 01:40:39 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
unit-kfp-db-0: 01:40:43 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
unit-kfp-db-0: 01:44:24 INFO unit.kfp-db/0.juju-log Unit workload member-state is online with member-role secondary
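In case it helps with triage, here is a minimal sketch for counting how often the "Failed to get cluster endpoints" error shows up per unit in saved log output like the excerpt above. The function name `count_endpoint_failures` and the regular expression are assumptions made for illustration; they are not part of the charm or of Juju, and the pattern only targets the line format seen in this excerpt.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above, e.g.:
# unit-kfp-db-0: 01:36:15 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster endpoints
LINE_RE = re.compile(
    r"^(?P<unit>unit-[\w-]+): (?P<time>\d{2}:\d{2}:\d{2}) ERROR "
    r".*Failed to get cluster endpoints$"
)


def count_endpoint_failures(lines):
    """Return (failures per unit, list of (unit, time) hits) for the given log lines."""
    per_unit = Counter()
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            per_unit[match["unit"]] += 1
            hits.append((match["unit"], match["time"]))
    return per_unit, hits


# Sample lines copied from the log excerpt above.
sample = [
    "unit-kfp-db-0: 01:36:15 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster status for kfp-db-cluster",
    "unit-kfp-db-0: 01:36:15 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster endpoints",
    "charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status",
    "unit-kfp-db-0: 01:36:17 ERROR unit.kfp-db/0.juju-log database-peers:6: Failed to get cluster endpoints",
    "unit-kfp-db-0: 01:40:17 INFO unit.kfp-db/0.juju-log Unit workload member-state is online with member-role secondary",
]

per_unit, hits = count_endpoint_failures(sample)
print(per_unit)
print(hits)
```

The same function can be fed the lines of a saved log file to compare how often kfp-db and katib-db units hit the error and at what times.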
After the issue had appeared three times, we enabled debug-log, but so far we have not been able to reproduce it with debug-log enabled.
We also see no pod restarts.
This environment is running on an Azure AKS cluster. We have two identical clusters deployed, and the issue only happens on one of them. It affects not only kfp-db but also katib-db.
The workaround to get the cluster member back into a healthy state is to delete the pod.
Versions
Operating system: Ubuntu 22.04.5 LTS
Juju CLI: 3.5.4-genericlinux-amd64
Juju agent: 3.5.4
Charm revision: 8.0/stable rev 180
Kubernetes version: 1.29.9