
Duplicate records when stream throughput limit exceeded #249

Open
adrian-skybaker opened this issue Sep 15, 2022 · 3 comments

Comments

@adrian-skybaker

I'm seeing a significant number of duplicate records whenever I hit throttling on the Kinesis stream.

Obviously I realise I want to avoid throttling, but I'm wondering whether this is expected behaviour. For example, I would expect that even when batching, the plugin would retry only the failed parts of the batch, as in the sketch below.
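To illustrate, this is roughly the behaviour I'd expect — a minimal sketch with the AWS SDK for Go v1 (which I believe this plugin uses). The function name, backoff policy, and attempt limit are made up for the example; this is not the plugin's actual code.

package kinesisretry

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/kinesis"
)

// putWithRetry re-sends only the entries that the PutRecords response
// flags as failed, so records accepted on an earlier attempt are never
// written twice.
func putWithRetry(client *kinesis.Kinesis, stream string, entries []*kinesis.PutRecordsRequestEntry, maxAttempts int) error {
	pending := entries
	for attempt := 0; attempt < maxAttempts && len(pending) > 0; attempt++ {
		out, err := client.PutRecords(&kinesis.PutRecordsInput{
			StreamName: aws.String(stream),
			Records:    pending,
		})
		if err != nil {
			return err // whole request failed; nothing was accepted
		}
		if aws.Int64Value(out.FailedRecordCount) == 0 {
			return nil
		}
		// PutRecords reports success/failure per record; keep only the
		// entries whose result carries an error code (e.g.
		// ProvisionedThroughputExceededException) for the next attempt.
		var failed []*kinesis.PutRecordsRequestEntry
		for i, res := range out.Records {
			if res.ErrorCode != nil {
				failed = append(failed, pending[i])
			}
		}
		pending = failed
		time.Sleep(time.Second << uint(attempt)) // simple exponential backoff
	}
	if len(pending) > 0 {
		return fmt.Errorf("%d records still failing after %d attempts", len(pending), maxAttempts)
	}
	return nil
}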

If this is not expected, I'm happy to provide more logging if that's helpful (the sample below is warning level and above).

This is using amazon/aws-for-fluent-bit:init-2.28.1.

Log sample:

2022-09-15T17:47:17.678+12:00 | time="2022-09-15T05:47:17Z" level=warning msg="[kinesis 0] 1/2 records failed to be delivered. Will retry.\n"
2022-09-15T17:47:17.678+12:00 | time="2022-09-15T05:47:17Z" level=warning msg="[kinesis 0] Throughput limits for the stream may have been exceeded."
2022-09-15T17:47:19.103+12:00 | [2022/09/15 05:47:19] [ warn] [engine] failed to flush chunk '1-1663220835.534380470.flb', retry in 11 seconds: task_id=1, input=forward.1 > output=kinesis.1 (out_id=1)
Output configuration:

[OUTPUT]
    Name kinesis
    Match service-firelens*
    region ${AWS_REGION}
    stream my-stream-name
    aggregation true
    partition_key container_id
    compression gzip
@adrian-skybaker
Author

fluent/fluent-bit#2159 (comment) may be relevant.

@ashraf133

Hello, do you also encounter this error?

time="2023-11-17T13:23:29Z" level=error msg="[kinesis 0] The partition key could not be found in the record, using a random string instead"

@jryberg

jryberg commented Nov 24, 2023

> Hello, do you also encounter this error?
>
> time="2023-11-17T13:23:29Z" level=error msg="[kinesis 0] The partition key could not be found in the record, using a random string instead"

You need to make sure that the key you configure as partition_key actually exists in your log records; see the example below.
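For example, with partition_key container_id as in the config above, every record needs a top-level container_id key, something like this (the values here are made up):

{"container_id": "4df0a1b2c3d4", "source": "stdout", "log": "GET /health HTTP/1.1 200"}

FireLens normally injects container_id into each record for ECS workloads; if some records lack the key (or it sits under a different name), the plugin falls back to a random partition key, which is exactly what that error message is reporting.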
