drain_after_revoke failed due to killed process #117

yordis · 2022-09-02T19:33:34Z

I am receiving the following error in Sentry:

Sentry.CrashError: ** (exit) exited in: GenServer.call(#PID<0.5095.0>, :drain_after_revoke, :infinity)
    ** (EXIT) killed
  File "lib/gen_server.ex", line 1030, in GenServer.call/3
  File "lib/broadway_kafka/producer.ex", line 525, in anonymous fn/2 in BroadwayKafka.Producer.assignments_revoked/1
  File "/opt/app/deps/telemetry/src/telemetry.erl", line 320, in :telemetry.span/3
  File "/opt/app/deps/brod/src/brod_group_coordinator.erl", line 502, in :brod_group_coordinator.stabilize/3
  File "/opt/app/deps/brod/src/brod_group_coordinator.erl", line 416, in :brod_group_coordinator.handle_info/2
  File "gen_server.erl", line 695, in :gen_server.try_dispatch/4
  File "gen_server.erl", line 771, in :gen_server.handle_msg/6
  File "proc_lib.erl", line 226, in :proc_lib.init_p_do_apply/3

Coming from

broadway_kafka/lib/broadway_kafka/producer.ex

Line 525 in 271464f

GenStage.call(producer_pid, :drain_after_revoke, :infinity)

I am wondering whether we should catch the exit and return :ok here.

thoughts?
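To make the proposal concrete, here is a minimal sketch of what "catch the error and return :ok" could look like. It is not the library's code: `DrainSketch` and `safe_drain/2` are hypothetical names, and it uses `GenServer.call/3` (which appears in the stack trace above) in place of `GenStage.call/3`. It assumes that a killed or already-dead producer has nothing left to drain, which may not hold in every case.

```elixir
defmodule DrainSketch do
  @moduledoc false

  # Hypothetical wrapper, not part of BroadwayKafka.
  # Calls the producer and treats "the producer is gone" as a successful drain.
  def safe_drain(producer_pid, timeout \\ :infinity) do
    GenServer.call(producer_pid, :drain_after_revoke, timeout)
  catch
    # GenServer.call/3 exits with {reason, {GenServer, :call, args}} when the
    # callee dies during the call (:killed) or is already dead (:noproc).
    :exit, {reason, {GenServer, :call, _args}}
    when reason in [:killed, :noproc, :shutdown] ->
      :ok
  end
end
```

Any other exit reason (for example a timeout) is deliberately not matched, so it still propagates to the caller.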


slashmili · 2022-09-05T10:55:11Z

When a new consumer joins the consumer group, Kafka asks all the consumers to stop what they are doing and join the new generation (hence the drain_after_revoke call).

At the same time, your Erlang node is trying to stop all of its processes because the deployment is triggering a shutdown.

~~I think what is happening here is that your broadway consumers are not finishing the job on time and the beam is killing them forcefully.~~
~~Edit1: What I wrote here doesn't make sense since broadway consumers are independent of the producer process.~~
Edit2: What I said originally does make sense: the producer waits for all the handover jobs to finish before returning from handle_call.

I'd suggest measuring the consumption time of your messages using telemetry. If it is low (~20-30 milliseconds), it could be that the dispatcher is overloaded.
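One way to take that measurement, as a rough sketch: attach a telemetry handler to the event Broadway emits after each `handle_message/3` call. The event name `[:broadway, :processor, :message, :stop]` and its `duration` measurement are taken from my reading of Broadway's telemetry documentation, so check them against the version you run. `ProcessingTimeLogger` and the handler id are hypothetical names, and the `:telemetry` package is assumed to be available (it is a Broadway dependency).

```elixir
defmodule ProcessingTimeLogger do
  @moduledoc false
  require Logger

  # Assumed event name; verify against your Broadway version's docs.
  @event [:broadway, :processor, :message, :stop]

  # Call once at application start, before the pipeline begins consuming.
  def attach do
    :telemetry.attach("log-processing-time", @event, &__MODULE__.handle_event/4, nil)
  end

  # Telemetry handler: logs how long a single message took to process.
  def handle_event(@event, %{duration: duration}, _metadata, _config) do
    Logger.info("message processed in #{duration_ms(duration)} ms")
  end

  # Durations are reported in native time units; convert to milliseconds.
  def duration_ms(native_duration) do
    System.convert_time_unit(native_duration, :native, :millisecond)
  end
end
```

Logging every message is noisy under load; in practice you would likely aggregate these durations into a histogram or summary metric instead.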

yordis · 2022-09-07T02:39:55Z

Maybe related to?

josevalim · 2023-03-14T17:09:01Z

We have pushed several improvements here, including a just published new version. Please let us know if the error persists!

josevalim closed this as completed Mar 14, 2023
