Skip to content

[FanOutRecordsPublisher] ClosedChannelException WARN logs with Netty HTTP2 client (KCL 3.1.x) #1605

@anjeongkyun

Description

@anjeongkyun

Summary

When running with Spring Cloud Stream Kinesis Binder (4.0.4) using KCL 3.1.x under JDK 21, I observed repeated ClosedChannelException warnings caused by Netty HTTP2 channel closures.
KCL recovers by recreating subscriptions, but the WARN logs keep repeating, creating operational noise.

Logs (excerpt)

2025-09-21 14:44:04 WARN s.a.k.r.f.FanOutRecordsPublisher:350 - shardId-000000000003: [SubscriptionLifetime] - (FanOutRecordsPublisher#errorOccurred) ...
java.io.IOException: An error occurred on the connection: java.nio.channels.ClosedChannelException
at software.amazon.awssdk.http.nio.netty.internal.http2.MultiplexedChannelRecord.decorateConnectionException(...)
...
2025-09-21 14:44:05 WARN s.a.k.l.ShardConsumerSubscriber:115 - shardId-000000000003: Failure occurred in retrieval. Restarting data requests

Environment

  • Spring Boot: 3.3.5
  • Spring Cloud Stream Binder for Kinesis: 4.0.4
  • Spring Integration AWS: 3.0.8
  • Amazon Kinesis Client (KCL): 3.1.x (pulled transitively by the binder)
  • JDK: 21
  • Fan-Out: enabled

KinesisAsyncClient Bean

@Bean
@Primary
open fun kinesisClient(): KinesisAsyncClient {
    val httpClient =
        NettyNioAsyncHttpClient.builder()
            .protocol(Protocol.HTTP2)
            .http2Configuration { it.healthCheckPingPeriod(Duration.ofSeconds(60)) }
            .connectionMaxIdleTime(Duration.ofSeconds(300))
            .build()

    return KinesisAsyncClient.builder()
        .httpClient(httpClient)
        .region(Region.of(region))
        .apply { endpoint?.let { endpointOverride(URI.create(it)) } }
        .build()
}

Spring Cloud Stream configuration (excerpt)

spring:
  cloud:
    stream:
      bindings:
        handlePurchaseDomainEvent-in-0:
          group: shared-purchase-event-consumer
          destination: shared-purchase-domain-event-stream
          consumer:
            concurrency: 4
        handlePrepaidCardDomainEvent-in-0:
          group: shared-prepaid-card-event-consumer
          destination: shared-prepaid-card-domain-event-stream
          consumer:
            concurrency: 4
      kinesis:
        binder:
          auto-create-stream: false
          kpl-kcl-enabled: true
          checkpoint:
            table: sharedPurchaseIntegrationMetadataStore
          locks:
            table: sharedPurchaseIntegrationLockRegistry
          enable-observation: false
        bindings:
          handlePurchaseDomainEvent:
            consumer:
              fan-out: true
              metrics-level: NONE
              checkpointMode: batch
              idle-between-polls: 250
              consumerBackoff: 100
              gracefulShutdownTimeout: 20000
          handlePrepaidCardDomainEvent:
            consumer:
              fan-out: true
              metrics-level: NONE
              checkpointMode: batch
              idle-between-polls: 250
              consumerBackoff: 100
              gracefulShutdownTimeout: 20000

Observed Behavior

WARN logs for ClosedChannelException repeat frequently when using Fan-Out consumers.
KCL retries and continues consuming, but the WARN noise persists.

Expected Behavior

Channel closures should be handled more gracefully.
Either recover silently, or downgrade the log level to INFO/DEBUG unless data loss occurs.

Impact

No obvious data loss, but WARN spam makes logs noisy.
Operational monitoring systems treat this as an error, making it difficult to distinguish critical issues.

Additional Context

This appears to be triggered when Http2MultiplexedChannelPool closes parent channels, propagating exceptions to all child channels.
Might be related to idle connection management in Netty + AWS SDK v2.

Questions

  • Is this behavior expected in the current design of FanOutRecordsPublisher / ShardConsumerSubscriber?
    I would like to better understand which part of the connection handling logic is responsible for repeatedly logging these ClosedChannelExceptions.
  • If this is considered an improvement opportunity (for example, adjusting the log level or handling channel closure more gracefully), I am interested in contributing a fix.
    Could you please advise whether such a contribution would be welcome, and which area of the codebase would be most relevant to look into?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions