Self Checks
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.
Dify version
0.12.1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
I deployed the Dify images in a Kubernetes cluster, started 5 dify-worker pods, and used Flower to monitor the status of the worker nodes.
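Each dify-worker pod runs a Celery worker that connects to the shared Redis broker and publishes task events that Flower consumes. The snippet below is only a minimal standalone sketch of that setup (the app name, password, and concurrency are placeholders, not Dify's real configuration), included to show what Flower is monitoring:

```python
from celery import Celery

# Hypothetical stand-in for a dify-worker process; the real app, queues,
# and credentials live inside the Dify images.
app = Celery(
    "dify_worker_sketch",
    broker="redis://:password@redis-cluster-master.dify-ai.svc.cluster.local:6379/1",
    backend="redis://:password@redis-cluster-master.dify-ai.svc.cluster.local:6379/1",
)

# Flower relies on task/worker events; this corresponds to the
# "Events of group {task} enabled by remote" lines in the logs below.
app.conf.worker_send_task_events = True
app.conf.task_send_sent_event = True

if __name__ == "__main__":
    # Roughly what each dify-worker pod does at startup.
    app.worker_main(argv=["worker", "--loglevel=INFO", "--concurrency=1"])
```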
When all the containers first started, everything ran well. But when I uploaded a document to build the index dictionary, the node that received the task went offline and never recovered. At the same time, I found the following logs in my log console:
# podname: dify-worker-6cb6dc8697-bp8q2
2025-01-03 10:59:24.453 INFO [MainThread] [connection.py:22] - Connected to redis://:**@redis-cluster-master.dify-ai.svc.cluster.local:6379/1
2025-01-03 10:59:24.459 INFO [MainThread] [mingle.py:40] - mingle: searching for neighbors
2025-01-03 10:59:25.824 INFO [MainThread] [mingle.py:43] - mingle: sync with 1 nodes
2025-01-03 10:59:25.826 INFO [MainThread] [mingle.py:47] - mingle: sync complete
2025-01-03 10:59:25.856 INFO [Dummy-3] [pidbox.py:111] - pidbox: Connected to redis://:**@redis-cluster-master.dify-ai.svc.cluster.local:6379/1.
2025-01-03 10:59:25.867 INFO [MainThread] [worker.py:175] - celery@dify-worker-6cb6dc8697-bp8q2 ready.
2025-01-03 10:59:26.947 INFO [Dummy-3] [control.py:342] - Events of group {task} enabled by remote.
2025-01-03 10:59:34.291 INFO [Dummy-3] [control.py:375] - sync with celery@dify-worker-6cb6dc8697-lmjh2
2025-01-03 10:59:38.351 INFO [Dummy-3] [control.py:375] - sync with celery@dify-worker-6cb6dc8697-wr8qj
2025-01-03 10:59:40.604 INFO [Dummy-3] [control.py:375] - sync with celery@dify-worker-6cb6dc8697-6fdnr
2025-01-03 11:11:35.096 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[f87d18b4-ccad-452c-abb8-e36c0d38a6e2] received
2025-01-03 11:11:35.915 INFO [Dummy-5] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:11:35.995 INFO [Dummy-5] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[f87d18b4-ccad-452c-abb8-e36c0d38a6e2] succeeded in 0.8925492763519287s: None
2025-01-03 11:11:40.037 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[4076ad1a-1d1e-48a8-9a12-66b0be8abc59] received
2025-01-03 11:11:40.399 INFO [Dummy-8] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:11:40.473 INFO [Dummy-8] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[4076ad1a-1d1e-48a8-9a12-66b0be8abc59] succeeded in 0.4311293810606003s: None
2025-01-03 11:23:36.277 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[b7819f67-153b-40c7-b752-e1fd99946dc2] received
# podname: dify-worker-6cb6dc8697-6fdnr
2025-01-03 10:59:40.575 INFO [MainThread] [connection.py:22] - Connected to redis://:**@redis-cluster-master.dify-ai.svc.cluster.local:6379/1
2025-01-03 10:59:40.582 INFO [MainThread] [mingle.py:40] - mingle: searching for neighbors
2025-01-03 10:59:41.613 INFO [MainThread] [mingle.py:43] - mingle: sync with 4 nodes
2025-01-03 10:59:41.614 INFO [MainThread] [mingle.py:47] - mingle: sync complete
2025-01-03 10:59:41.643 INFO [MainThread] [worker.py:175] - celery@dify-worker-6cb6dc8697-6fdnr ready.
2025-01-03 10:59:41.650 INFO [Dummy-3] [pidbox.py:111] - pidbox: Connected to redis://:**@redis-cluster-master.dify-ai.svc.cluster.local:6379/1.
2025-01-03 10:59:41.944 INFO [Dummy-3] [control.py:342] - Events of group {task} enabled by remote.
2025-01-03 11:12:33.123 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[aff71b6d-b733-4492-ae48-f6d4d5ba2a50] received
2025-01-03 11:12:33.732 INFO [Dummy-5] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:12:33.783 INFO [Dummy-5] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[aff71b6d-b733-4492-ae48-f6d4d5ba2a50] succeeded in 0.6574982842430472s: None
2025-01-03 11:14:07.699 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[cfd961eb-2b3b-4d15-94ae-fee61f54c30c] received
2025-01-03 11:14:08.106 INFO [Dummy-8] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:14:08.223 INFO [Dummy-8] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[cfd961eb-2b3b-4d15-94ae-fee61f54c30c] succeeded in 0.5212409980595112s: None
2025-01-03 11:15:04.701 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[fbf14db5-c761-4fb6-a1e1-6aeab3d1106e] received
2025-01-03 11:15:05.049 INFO [Dummy-11] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:15:05.108 INFO [Dummy-11] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[fbf14db5-c761-4fb6-a1e1-6aeab3d1106e] succeeded in 0.40471500158309937s: None
2025-01-03 11:15:42.626 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[d0609a4a-a704-446c-a745-c6795f73e8d5] received
2025-01-03 11:15:42.911 INFO [Dummy-14] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:15:43.651 INFO [Dummy-14] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[d0609a4a-a704-446c-a745-c6795f73e8d5] succeeded in 1.0220515485852957s: None
2025-01-03 11:16:50.840 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[65715d29-f806-449e-82b6-671011214006] received
2025-01-03 11:16:51.130 INFO [Dummy-17] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:16:51.219 INFO [Dummy-17] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[65715d29-f806-449e-82b6-671011214006] succeeded in 0.377222154289484s: None
2025-01-03 11:17:16.004 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[cf306ca0-5869-4de3-9bfa-84e1a1f635a3] received
2025-01-03 11:17:16.338 INFO [Dummy-20] [ops_trace_task.py:48] - Processing trace tasks success, app_id: 9a71b7b7-7c8b-407b-b92b-93d425b16157
2025-01-03 11:17:16.411 INFO [Dummy-20] [trace.py:128] - Task tasks.ops_trace_task.process_trace_tasks[cf306ca0-5869-4de3-9bfa-84e1a1f635a3] succeeded in 0.40508390963077545s: None
2025-01-03 11:23:41.830 INFO [Dummy-22] [gossip.py:145] - missed heartbeat from celery@dify-worker-6cb6dc8697-bp8q2
2025-01-03 11:24:56.838 INFO [Dummy-23] [gossip.py:145] - missed heartbeat from celery@dify-worker-6cb6dc8697-fwnsq
2025-01-03 11:32:51.891 INFO [Dummy-24] [gossip.py:145] - missed heartbeat from celery@dify-worker-6cb6dc8697-wr8qj
2025-01-03 11:34:45.596 INFO [MainThread] [strategy.py:161] - Task tasks.ops_trace_task.process_trace_tasks[c09875e9-ff7d-45d8-bbca-47ee82b97746] received
The screenshot of Flower:
The workers that received tasks were all offline, and only the workers that did not receive any task were still online.
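As a rough diagnostic (not part of the original report), one way to tell whether those workers are actually dead or merely being shown as offline by Flower is to ping them through the Celery control API from any pod that can reach the same Redis broker; the broker URL and password below are placeholders:

```python
from celery import Celery

# Placeholder broker URL/password -- in practice, the same Redis URL the
# dify-worker pods connect to.
app = Celery(broker="redis://:password@redis-cluster-master.dify-ai.svc.cluster.local:6379/1")

# ping() broadcasts to all workers and collects a reply from each one that
# is still alive, e.g. [{'celery@dify-worker-6cb6dc8697-bp8q2': {'ok': 'pong'}}].
replies = app.control.ping(timeout=5.0)
alive = {name for reply in replies for name in reply}
print("workers replying to ping:", alive)
```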
✔️ Expected Behavior
The workers should always stay online so they can accept tasks to build the index dictionary.
❌ Actual Behavior
The workers that received tasks went offline and cannot recover to accept new tasks.