Skip to content

[Bug] Pulsar-function BK read-retry-loop with no backoff cause CPU exhaution #24677

@vroyer

Description

@vroyer

Search before reporting

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

User environment

We are running an HELM deployed pulsar cluster running on an on-prem kubernetes (lunastreaming 4.1.3.18, https://github.com/datastax/pulsar/tree/ls31_4.18 ) with 3 bookies + 3 brokers.

Issue Description

When a bookie is unavailable because of a k8s hardware issue (1 out of 3 bookies with quorum=2), the pulsar-function try to read some metadata from bookkeeper and the unavilable bookie cause a very CPU expensive read retry loop. As the result, the pulsar-function liveness check (curl on http://localhost:6670/metrics/) fails and k8s kills the pulsar-function pod.

4m57s       Warning   Unhealthy          pod/pulsar-function-0                        Liveness probe failed:

Meanwhile, some pulsar connector pods cannot start properly, which is not acceptable for a production usage when losing 1 bookie over 3.

Error messages

2025-08-28T07:46:59,657+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: null/glpdlskub016:3181, current state CONNECTING :
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
	at org.apache.bookkeeper.client.DefaultBookieAddressResolver.resolve(DefaultBookieAddressResolver.java:69) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.PerChannelBookieClient.connect(PerChannelBookieClient.java:540) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.PerChannelBookieClient.connectIfNeededAndDoOp(PerChannelBookieClient.java:682) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.DefaultPerChannelBookieClientPool.obtain(DefaultPerChannelBookieClientPool.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.DefaultPerChannelBookieClientPool.obtain(DefaultPerChannelBookieClientPool.java:115) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.BookieClientImpl.readEntry(BookieClientImpl.java:499) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.BookieClientImpl.readEntry(BookieClientImpl.java:485) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.proto.BookieClientImpl.readEntry(BookieClientImpl.java:479) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.PendingReadOp.sendReadTo(PendingReadOp.java:583) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:408) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.read(PendingReadOp.java:389) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.PendingReadOp.initiate(PendingReadOp.java:533) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.PendingReadOp.run(PendingReadOp.java:543) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.LedgerHandle.readEntriesInternalAsync(LedgerHandle.java:896) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.LedgerHandle.asyncReadEntriesInternal(LedgerHandle.java:798) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.LedgerHandle.asyncReadEntries(LedgerHandle.java:692) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.distributedlog.impl.logsegment.BKLogSegmentEntryReader.issueSimpleRead(BKLogSegmentEntryReader.java:632) ~[org.apache.distributedlog-distributedlog-core-4.16.7.jar:4.16.7]
	at org.apache.distributedlog.impl.logsegment.BKLogSegmentEntryReader.issueRead(BKLogSegmentEntryReader.java:616) ~[org.apache.distributedlog-distributedlog-core-4.16.7.jar:4.16.7]
	at org.apache.distributedlog.impl.logsegment.BKLogSegmentEntryReader.access$200(BKLogSegmentEntryReader.java:62) ~[org.apache.distributedlog-distributedlog-core-4.16.7.jar:4.16.7]
	at org.apache.distributedlog.impl.logsegment.BKLogSegmentEntryReader$CacheEntry.run(BKLogSegmentEntryReader.java:243) ~[org.apache.distributedlog-distributedlog-core-4.16.7.jar:4.16.7]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) ~[com.google.guava-guava-32.1.1-jre.jar:?]
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75) ~[com.google.guava-guava-32.1.1-jre.jar:?]
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) ~[com.google.guava-guava-32.1.1-jre.jar:?]
	at org.apache.bookkeeper.common.util.SingleThreadSafeScheduledExecutorService$SafeRunnable.run(SingleThreadSafeScheduledExecutorService.java:46) ~[org.apache.bookkeeper-bookkeeper-common-4.16.7.jar:4.16.7]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.122.Final.jar:4.1.122.Final]
	at java.lang.Thread.run(Thread.java:840) ~[?:?]
Caused by: org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
	at org.apache.bookkeeper.discover.ZKRegistrationClient.getBookieServiceInfo(ZKRegistrationClient.java:249) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	at org.apache.bookkeeper.client.DefaultBookieAddressResolver.resolve(DefaultBookieAddressResolver.java:47) ~[org.apache.bookkeeper-bookkeeper-server-4.16.7.jar:4.16.7]
	... 31 more
2025-08-28T07:46:59,658+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E1 from bookie: glpdlskub016:3181
2025-08-28T07:46:59,658+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E1-E1, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:46:59,857+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:46:59,857+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:46:59,857+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E2 from bookie: glpdlskub016:3181
2025-08-28T07:46:59,857+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E2-E2, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:00,057+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:00,057+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:00,057+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E3 from bookie: glpdlskub016:3181
2025-08-28T07:47:00,057+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E3-E3, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:00,257+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:00,257+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:00,257+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E0 from bookie: glpdlskub016:3181
2025-08-28T07:47:00,257+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E0-E0, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:00,403+0000 [function-web-27-9] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [28/Aug/2025:07:47:00 +0000] "GET /metrics/ HTTP/1.1" 200 19970 "-" "curl/7.81.0" 14
2025-08-28T07:47:00,458+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:00,459+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:00,459+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E1 from bookie: glpdlskub016:3181
2025-08-28T07:47:00,459+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E1-E1, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:00,658+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:00,658+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:00,658+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E2 from bookie: glpdlskub016:3181
2025-08-28T07:47:00,658+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E2-E2, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:00,858+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:00,858+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:00,858+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E3 from bookie: glpdlskub016:3181
2025-08-28T07:47:00,859+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E3-E3, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:01,058+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:01,058+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:01,058+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E0 from bookie: glpdlskub016:3181
2025-08-28T07:47:01,058+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E0-E0, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:01,259+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:01,260+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:01,260+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E1 from bookie: glpdlskub016:3181
2025-08-28T07:47:01,260+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E1-E1, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:01,459+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:01,459+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:01,459+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E2 from bookie: glpdlskub016:3181
2025-08-28T07:47:01,459+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E2-E2, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:01,659+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:01,659+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:01,660+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E3 from bookie: glpdlskub016:3181
2025-08-28T07:47:01,660+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E3-E3, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:01,859+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:01,859+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:01,859+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E0 from bookie: glpdlskub016:3181
2025-08-28T07:47:01,859+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E0-E0, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:02,061+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:02,061+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:02,061+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E1 from bookie: glpdlskub016:3181
2025-08-28T07:47:02,061+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E1-E1, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:02,260+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:02,260+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:02,260+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E2 from bookie: glpdlskub016:3181
2025-08-28T07:47:02,260+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E2-E2, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:02,460+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:02,461+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:02,461+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E3 from bookie: glpdlskub016:3181
2025-08-28T07:47:02,461+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E3-E3, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:02,660+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:02,660+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:02,660+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E0 from bookie: glpdlskub016:3181
2025-08-28T07:47:02,660+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E0-E0, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:02,862+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:02,862+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:02,862+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E1 from bookie: glpdlskub016:3181
2025-08-28T07:47:02,862+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E1-E1, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:03,061+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:03,061+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:03,061+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E2 from bookie: glpdlskub016:3181
2025-08-28T07:47:03,061+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E2-E2, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:03,262+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:03,262+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:03,262+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E3 from bookie: glpdlskub016:3181
2025-08-28T07:47:03,262+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E3-E3, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:03,461+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:03,461+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running
2025-08-28T07:47:03,461+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L903175 E0 from bookie: glpdlskub016:3181
2025-08-28T07:47:03,461+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L903175 E0-E0, Sent to [glpdlskub016:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
2025-08-28T07:47:03,663+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve glpdlskub016:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-08-28T07:47:03,663+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to glpdlskub016:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running

Reproducing the issue

Additional information

This read-retry-loop with no backoff for an unavailable bookie seems to be a bookkeeper issue (bk version 4.16.7), because there is even no BK read backoff setting to mitigate this kind of situation, it keeps retring every 200ms. It is useless and expensive to immediately retry reading from a bookie with this error: "Cannot resolve bookieId glpdlskub016:3181, bookie does not exist or it is not running"

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/bugThe PR fixed a bug or issue reported a bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions