Error when launching Coordinator on kubernetes cluster #1942
karimDev13
started this conversation in
General
@lidongze0629 Could you please take a look? Thanks!
Hi, I'm using a virtual machine to deploy GraphScope. I installed all the requirements, but when I start a session I get this message:
2022-08-07 06:35:28,636 [INFO][session:684]: Initializing graphscope session with parameters: {'addr': None, 'mode': 'eager', 'cluster_type': 'k8s', 'num_workers': 2, 'preemptive': True, 'k8s_namespace': None, 'k8s_service_type': 'NodePort', 'k8s_gs_image': 'registry.cn-hongkong.aliyuncs.com/graphscope/graphscope:0.15.0', 'k8s_etcd_image': 'quay.io/coreos/etcd:v3.4.13', 'k8s_image_pull_policy': 'IfNotPresent', 'k8s_image_pull_secrets': [], 'k8s_coordinator_cpu': 1.5, 'k8s_coordinator_mem': '2Gi', 'etcd_addrs': None, 'k8s_etcd_num_pods': 1, 'k8s_etcd_cpu': 1.0, 'k8s_etcd_mem': '512Mi', 'k8s_vineyard_daemonset': 'none', 'k8s_vineyard_cpu': 0.2, 'k8s_vineyard_mem': '512Mi', 'vineyard_shared_mem': '4Gi', 'k8s_engine_cpu': 0.2, 'k8s_engine_mem': '1Gi', 'k8s_mars_worker_cpu': 0.2, 'k8s_mars_worker_mem': '4Mi', 'k8s_mars_scheduler_cpu': 0.2, 'k8s_mars_scheduler_mem': '2Mi', 'with_mars': False, 'reconnect': False, 'k8s_volumes': {'data': {'type': 'hostPath', 'field': {'path': '/test_data/', 'type': 'Directory'}, 'mounts': {'mountPath': '/testingdata'}}}, 'k8s_waiting_for_delete': False, 'timeout_seconds': 600, 'dangling_timeout_seconds': 600, 'mount_dataset': None, 'k8s_dataset_image': 'registry.cn-hongkong.aliyuncs.com/graphscope/dataset:0.15.0', 'k8s_client_config': {}}
2022-08-07 06:35:28,741 [INFO][cluster:303]: Launching coordinator...
2022-08-07 06:35:31,829 [INFO][utils:182]: coordinator-pmpiuw-68b6658888-dqqdj: Successfully assigned gs-vwnkam/coordinator-pmpiuw-68b6658888-dqqdj to kind-control-plane
2022-08-07 06:35:31,830 [INFO][utils:182]: coordinator-pmpiuw-68b6658888-dqqdj: Container image "registry.cn-hongkong.aliyuncs.com/graphscope/graphscope:0.15.0" already present on machine
2022-08-07 06:35:31,830 [INFO][utils:182]: coordinator-pmpiuw-68b6658888-dqqdj: Created container coordinator
2022-08-07 06:35:31,831 [INFO][utils:182]: coordinator-pmpiuw-68b6658888-dqqdj: Started container coordinator
2022-08-07 21:35:33,884 [INFO][cluster:695]: Launching etcd ...
2022-08-07 21:35:34,983 [INFO][cluster:906]: Etcd created, endpoint is 10.96.178.139:58225
2022-08-07 21:35:34,983 [INFO][cluster:927]: Creating interactive engine service...
2022-08-07 21:35:34,983 [INFO][cluster:855]: Launching zetcd proxy service ...
2022-08-07 21:35:34,984 [INFO][cluster:867]: zetcd cmd /home/graphscope/.local/lib/python3.8/site-packages/graphscope.runtime/bin/zetcd --zkaddr 0.0.0.0:2181 --endpoints http://gs-etcd-service-pmpiuw:58225,http://gs-etcd-pmpiuw-0:58225
Running zetcd proxy
Version: Version not provided (use make instead of go build)
SHA: SHA not provided (use make instead of go build)
2022-08-07 21:35:36,008 [INFO][cluster:896]: ZEtcd is ready, endpoint is 10.244.0.7:2181
2022-08-07 21:35:36,008 [INFO][cluster:934]: Creating engine replicaset...
2022-08-07 21:35:36,009 [INFO][cluster:534]: Launching GraphScope engines pod ...
2022-08-07 21:35:36,703 [INFO][cluster:987]: [gs-engine-pmpiuw-kjbt5]: Successfully assigned gs-vwnkam/gs-engine-pmpiuw-kjbt5 to kind-control-plane
2022-08-07 21:35:36,705 [INFO][cluster:987]: [gs-engine-pmpiuw-kjbt5]: MountVolume.SetUp failed for volume "data" : hostPath type check failed: /test_data/ is not a directory
2022-08-07 21:35:37,709 [INFO][cluster:987]: [gs-engine-pmpiuw-nfrqk]: Successfully assigned gs-vwnkam/gs-engine-pmpiuw-nfrqk to kind-control-plane
2022-08-07 21:35:37,711 [INFO][cluster:987]: [gs-engine-pmpiuw-nfrqk]: MountVolume.SetUp failed for volume "data" : hostPath type check failed: /test_data/ is not a directory
2022-08-07 06:35:42,178 [INFO][utils:182]: coordinator-pmpiuw-68b6658888-dqqdj: Readiness probe failed: dial tcp 10.244.0.7:59142: connect: connection refused
2022-08-07 21:37:39,734 [INFO][cluster:987]: [gs-engine-pmpiuw-kjbt5]: Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[default-token-bfxqc vineyard-ipc-volume host-shm data]: timed out waiting for the condition
2022-08-07 21:37:40,740 [INFO][cluster:987]: [gs-engine-pmpiuw-nfrqk]: Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[vineyard-ipc-volume host-shm data default-token-bfxqc]: timed out waiting for the condition
2022-08-07 21:42:14,990 [INFO][cluster:987]: [gs-engine-pmpiuw-kjbt5]: Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[vineyard-ipc-volume host-shm data default-token-bfxqc]: timed out waiting for the condition
2022-08-07 21:44:30,965 [INFO][cluster:987]: [gs-engine-pmpiuw-kjbt5]: Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[host-shm data default-token-bfxqc vineyard-ipc-volume]: timed out waiting for the condition
Traceback (most recent call last):
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/deploy/kubernetes/cluster.py", line 569, in start
self._waiting_for_services_ready()
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/deploy/kubernetes/cluster.py", line 500, in _waiting_for_services_ready
timeout_seconds=self._saved_locals["timeout_seconds"],
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/deploy/kubernetes/utils.py", line 137, in wait_for_deployment_complete
raise TimeoutError("Waiting timeout for deployment {}".format(name))
TimeoutError: Waiting timeout for deployment coordinator-pmpiuw
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "session.py", line 21, in <module>
sess = graphscope.session(k8s_volumes=k8s_volumes)
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/client/utils.py", line 357, in wrapper
return_value = func(*args, **kwargs)
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/client/session.py", line 716, in __init__
self._connect()
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/client/session.py", line 1041, in _connect
self._launcher.start()
File "/home/karim/.local/lib/python3.7/site-packages/graphscope/deploy/kubernetes/cluster.py", line 581, in start
) from e
graphscope.framework.errors.K8sError: Error when launching Coordinator on kubernetes cluster
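Reading the log above, both engine pods report `MountVolume.SetUp failed for volume "data" : hostPath type check failed: /test_data/ is not a directory`, and the coordinator timeout appears to follow from that. A `hostPath` volume is resolved on the Kubernetes node, not on the machine running the client. The node name `kind-control-plane` suggests a kind cluster, in which case the node is a container and a `/test_data/` directory on the VM is not visible to it unless it is mounted into the node. Below is a minimal sketch of the `k8s_volumes` argument, reconstructed from the logged session parameters. The helper names `host_path_dirs` and `launch_session` are hypothetical and are not part of GraphScope.

```python
# Sketch reconstructed from the session parameters printed in the log.
# `host_path_dirs` and `launch_session` are hypothetical helper names.

k8s_volumes = {
    "data": {
        "type": "hostPath",
        "field": {"path": "/test_data/", "type": "Directory"},
        "mounts": {"mountPath": "/testingdata"},
    }
}


def host_path_dirs(volumes):
    """Return the hostPath paths that are declared with type 'Directory'."""
    paths = []
    for spec in volumes.values():
        field = spec.get("field", {})
        if spec.get("type") == "hostPath" and field.get("type") == "Directory":
            paths.append(field["path"])
    return paths


def launch_session(volumes):
    """Start a session as in the traceback; needs graphscope and a cluster."""
    import graphscope  # imported lazily so the helper above runs without it

    return graphscope.session(k8s_volumes=volumes)


# List the directories that must already exist on the node(s), since the
# 'Directory' type check fails when the path is missing there.
for path in host_path_dirs(k8s_volumes):
    print(f"{path} must exist as a directory on every node running an engine pod")
```

If the cluster is indeed kind, one way to check is to open a shell in the node container and confirm that `/test_data/` exists there before calling `launch_session(k8s_volumes)`.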